SECTION I


INTRODUCTION  AND  SUMMARY 

This  is  the  first  quarterly  technical  progress  report  for  Contract  No. 
DAAK  70-77-C-0248,  Prototype  Automatic  Target  Screener  (PATS). 

The report describes results of the first three months of a five-month
Phase I design study for an automatic target screener that can operate
with first-generation thermal imagers employing common module
components. The period covered by this report is 21 September to
31 December 1977.

The  objective  of  this  effort  is  to  produce  a design  for  a prototype 
automatic  target  screener  (PATS).  The  screener  will  reduce  the  task 
loading  on  the  thermal  imager  operator  by  detecting  and  recognizing 
a limited  set  of  high  priority  targets  at  ranges  comparable  to  or  greater 
than  those  for  an  unassisted  observer.  A second  objective  is  to  provide 
enhancement of the video presentation to the operator. The image
enhancement includes: 1) automatic gain/brightness control, to relieve
the operator of the necessity to continually adjust the display gain and
brightness controls; and 2) DC restoration, to eliminate artifacts
resulting from AC coupling of the IR detectors.


The report consists of three principal sections. Section II describes the
effort under the image enhancement part of the study; Section III describes
the target screener design activities; and Section IV reports on the
interframe analysis task. Plans for the next three-month reporting
period are included in Section V.

The image enhancement portion of PATS will consist of circuitry to operate
on the Common Module FLIR (MODFLIR) video output signal. This
circuitry will provide global gain and bias control in the form of
feedback to the MODFLIR to maintain the signal within the dynamic
range of the electro-optical multiplexer. The global gain and bias control
circuit preliminary design has been completed and will be implemented
upon receipt of the GFE MODFLIR.

Image enhancement will also include local area gain and brightness control
to enhance local variations of contrast and compress the overall scene
dynamic range to match that of the display. This circuitry has been
completed and examples of its performance on videotaped thermal image
data are included, along with the circuit description, in Section II.

The third image enhancement circuit is for DC restoration, to eliminate
the streaking associated with loss of line-to-line correlation on the
displayed image because of the AC coupling of the detector channels. We
have modified our original implementation approach to DC restoration,
which was based on measuring the histogram of the line-to-line intensity
differences. The new scheme provides a much simpler implementation
of basically the same approach and is also described in Section II. A
breadboard version of the simpler concept has been designed and is
being constructed.

Sections III and IV summarize the effort to date on the target screener
design  task.  Section  III  describes  the  results  of  the  subtasks  involved 
with  detecting  and  recognizing  targets  within  a single  frame  of  imagery. 
Section  IV  on  interframe  analysis  documents  research  on  new  algorithms 
to  improve  the  target  screener  performance  by  correlating  image  features 
over  a sequence  of  frames. 

The  target  screening  tasks  reported  in  Section  III  are  data  preparation 
and  analysis,  image  segmentation,  and  feature  extraction.  Remaining 
are the object classification and target decision tasks. This effort
is somewhat behind the planned schedule, primarily because the data
preparation task required more effort than planned.

The data preparation task was expanded to include digitization of taped
FLIR imagery. A total of 260 frames of FLIR imagery has been
digitized, annotated, and debanded where necessary. These images
contain  tanks,  armored  personnel  carriers  (APCs),  and  some  trucks 
and  represent  less  than  50  percent  of  the  estimated  quantity  of  imagery 
required  for  target  screener  design,  training,  and  testing.  Additional 
digitized  imagery  will  be  added  to  this  data  base  as  it  becomes  available. 

SECTION II

IMAGE  ENHANCEMENT 

This  section  reports  the  progress  on  the  image  enhancement  tasks  of 
the  Prototype  Automatic  Target  Screener  (PATS).  Specifically,  the 
tasks addressed are: synthetic DC restoration, global gain and bias
control, and local area gain and brightness control.

Figure 1 is a functional diagram showing these three functions in PATS.
On the DC restoration task, we have analyzed the hardware requirements
for the histogram technique of synthetic DC restoration (similar to the
NVL scheme) which we proposed. An alternate scheme has also been
developed. This scheme is all-analog and appears to be more readily
implementable than the histogram approach. Preliminary hardware
designs for both approaches and tradeoffs are discussed. The local
area gain and brightness function has already been breadboarded and
is now being tested. We include schematics of this design and sample
imagery which has been processed through this real-time hardware.

The global gain and bias approaches are discussed briefly. This
technique has not been computer simulated because doing so would
involve modeling the MODFLIR hardware. At this point we feel that the
microprocessor-based design proposed for the global control is flexible
enough to breadboard as soon as we acquire the FLIR.


Figure 1. Functional Image Enhancement System on PATS

SYNTHETIC DC RESTORATION

Description  of  the  Problem 

The parallel detectors in the MODFLIR are AC coupled to the FLIR
electronics. The practice of AC coupling arises for three reasons: 1) the
detectors are photoconductive and AC coupling eases the biasing
considerations on the detector; 2) there is a need to limit the 1/f noise
in the detectors; and 3) the elimination of DC by AC coupling subtracts the
average background component of the line from the video and thus
increases the contrast sensitivity of the displayed image. But AC
coupling of the detectors also results in some attendant degradations
of the resultant displayed image. The following paragraphs briefly
review the AC coupling degradation problem in the context of the
MODFLIR.


AC  coupling  degradations  can  be  characterized  by  two  essentially 
separate  phenomena: 

• The transient effects - undershoot at a rapid transition of
temperature.

• Steady state effects - loss of line-to-line correlation because
of the loss of the average value of the line (streaking and
droop).

The transient effect is shown in Figure 2, where the rect function suffers
a droop and an equal undershoot. As the following analysis shows, the
transient droop and undershoot problems in one scan line are minor for
the long time constants of the RC coupling circuitry in the MODFLIR, and
do not need to be corrected.


Figure  2.  Rect  Function  Response  of  an  RC  Circuit 

Figure 2 shows the basic RC circuit that approximates the low frequency
response of the entire FLIR electronics from the detector to the light
emitting diode (LED) stage. The step response of such a circuit is a
decaying exponential:

$$ v(t) = e^{-t/RC}\, U(t) $$

where $U(t)$ is the unit step. The rect input in Figure 2 is
$U(t) - U(t - t_0)$.

The response of the circuit to a rect function (which more closely
approximates a hot spot) can be derived from the unit step response
and is shown in Figure 2. The important feature in this response is
the droop $\delta$ of the top of the rect function and the corresponding equal
undershoot $\delta$ that follows the rect function. The time constant of the
equivalent RC circuit ($\tau = RC$) is related to the lower 3 dB cutoff frequency
of the system (detector to LED) of approximately 8 Hz by*

$$ \tau \approx \frac{1}{f_{3\,\mathrm{dB}}} \approx 1/8 \ \text{sec} $$

The most serious degradation of this kind will obviously occur when the
rect function duration $T$ is close to the scan line time, that is, 1/60 sec
(for a parallel scan system). It is instructive to estimate the droop $\delta$ as a
percentage of the rect function amplitude for the MODFLIR:

$$ \delta = \left(1 - e^{-T/\tau}\right) \times 100\% $$

For $T = 1/60$ sec and $\tau = 1/8$ sec, the above expression gives
$\delta \approx 12.5\%$, which indicates that the transient DC droop and
undershoot are not at all severe. Very seldom could we have the extreme
case of a hot spot extending the entire length of the scan line.
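
The droop estimate can be checked numerically. The following is a minimal sketch rather than part of the original analysis: the function name `droop_percent` is hypothetical, and the time constant is taken as the reciprocal of the cutoff frequency, following the approximation used in the text.

```python
import math

def droop_percent(duration_s, tau_s):
    """Droop of a rect of the given duration through an RC coupling stage,
    as a percentage of the rect amplitude: (1 - exp(-T/tau)) * 100."""
    return (1.0 - math.exp(-duration_s / tau_s)) * 100.0

scan_line_s = 1.0 / 60.0  # worst case: the hot spot spans the whole scan line
for cutoff_hz in (8.0, 3.0):
    tau_s = 1.0 / cutoff_hz  # approximation used in the text: tau ~ 1/f_3dB
    droop = droop_percent(scan_line_s, tau_s)
    print(f"{cutoff_hz:.0f} Hz cutoff: droop = {droop:.1f}%")
```

Under these assumptions the 8 Hz case gives about 12.5 percent and the 3 Hz case about 4.9 percent, in line with the footnoted remark that the droop is smaller at the lower cutoff.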

More serious by far than the transient effects is the fact that the parallel
AC coupled detectors in the MODFLIR are not DC restored or clamped at
the end of each scan line. When the imager is viewing a stable scene, the
average value of the video in each channel becomes zero within a few
fields' time, independently of the neighboring channels. This has serious


*This is a conservative estimate of the lower 3 dB cutoff of the MODFLIR.
The true value is somewhat lower. However, the droop is even smaller
at 3 Hz than at 8 Hz, and this analysis, showing that the droop in one scan
line is insignificant, holds.

consequences when the average scene temperature is changing rapidly
in  a direction  perpendicular  to  the  scan,  because  this  difference  is  then 
lost  in  the  display.  A classic  example  is  the  loss  of  horizon  definition 
because  the  detectors  scanning  the  cold  sky  and  those  scanning  the  hot 
ground  yield  the  same  average  video  signal  values. 

The  second  steady  state  effect  is  the  "streaking"  produced  by  a very  hot 
or  cold  target  against  a uniform  background.  This  again  happens  in  the 
parallel  scan  FLIR  with  no  DC  restore.  Figures  3b  and  4b  show  this 
effect  on  test  targets  (Figures  3a  and  4a)  degraded  to  approximate  the 
loss  of  DC  on  the  individual  scan  lines  (along  with  the  transient  effects). 
A common bias is added to all the lines to make the video compatible
with the display. Note the streaking evident in the degraded images. This
represents the most severe AC coupling degradation, as the presence
of a hot spot on a scan line can lower the rest of the scan line video
to blacker than black on the display and obscure any detail present
in the line. This happens because the average value of a scan line
in the top of Figure 4a is lower than the average value of a line with
the white hot portion on it. With the loss of the DC, the average values
of these lines in Figure 4b are now equal, depressing the lower line with
respect to the upper one. This is also the reason for the shading on
the hot and cold targets in Figure 4b.

Figure 3. Synthetic DC Restoration of Test Target No. 1: a. original test
pattern; b. degraded due to AC coupling; c. synthetic DC restoration

Figure 4. Synthetic DC Restoration of Test Target No. 2: a. original test
pattern; b. degraded due to AC coupling; c. synthetic DC restoration

Solutions to the DC Restore Problem


Inverse Filtering--This is the most tempting solution at first sight. It
involves designing an inverse sampled data filter to invert the high pass RC
transfer function shown in Figure 2. Since this inverse filter has a
singularity at very low frequencies (DC), the proposed solutions [1] are not
strict inverse filters, but simply boost the low frequencies so that the
lower 3 dB cutoff frequency is lower than before. This will lessen the
droop in the single scan line and increase the settling time (on a
stable scene lasting several frames). But complete recovery of the
very low frequency information necessary to maintain vertical DC
correlation is impossible without making the system unstable and
extremely sensitive to low frequency noise. We discarded this approach
to DC restoration for the following reasons:

• It will not cure the steady state loss of vertical DC
correlation on scenes lasting more than a few frame
times. Therefore, the FLIR viewing a test pattern,
for example, would still show the same streaking as
before.

• It requires access to each detector channel, because
each channel has to have its own inverse filter. This
involves modifying the common module.

• Each channel inverse filter has to exactly compensate for the
capacitor/resistor network on that channel. Errors will
lead to instability.

• It will make the system very sensitive to 1/f noise and to
amplifier drift.

[1] T. Noda, et al., "Final Report for Experimental Development of a FLIR
Sensor Processor," NORTHROP, Contract No. DAAG53-76-C-0188,
January 31, 1977.

Synthetic DC Restoration--This approach uses the properties of the
thermal scene to artificially restore the vertical correlation lost due
to AC coupling. We have analyzed two techniques for synthetic DC
restoration through computer simulation. One is a histogramming
approach and the second is an all-analog approach. Both use the
fact that the background in FLIR scenes changes slowly from
one scan line to the next.

1. Histogramming Approach [2, 3] to Synthetic DC Restoration--
We  recognize  that  the  background  in  a scene  varies  slowly 
from  line  to  line.  In  the  AC  coupled  video,  the  presence 
of  a significant  hot  target  in  one  line  causes  that  line  to 
be  depressed  with  respect  to  the  previous  line.  Therefore, 
the  pixel-to-pixel  differences  of  the  two  lines  will  be 
predominantly  distributed  at  or  near  the  average  DC  shift 


This technique, with minor differences, was independently conceived by:

[2] P. K. Raimondi, "Pseudo-DC Restoration Using Histogramming," NVL
(patent applied for).

[3] P. M. Narendra, et al., "Final Report on Automated Image Enhancement
Techniques for Second Generation FLIR," Honeywell, Contract No.
DAAG53-76-C-0195, December 1977.

of the second line with respect to the first. Figures 5a and b
illustrate  this.  By  recognizing  the  peak  in  the  histogram, 
we  can  identify  the  shift  of  the  DC  level  and  add  it  to  the 
second  line.  The  second  line  now  serves  as  the  reference 
for  the  next  line,  and  so  on  down.  Peak  detection  of  this 
histogram  may  not  be  very  suitable,  because  this  is  a 
sparse  histogram  (512  levels  and  about  500  pixels  in  the  scan 
line).  Therefore,  finding  the  true  peak  of  the  histogram 
would  involve  smoothing  of  the  histogram.  Also,  when  there 
is  no  change  in  the  DC  levels  between  successive  scan  lines, 
peak  seeking  will  not  give  a definitive  solution. 

Actually, we need only to detect the median of the histogram
(see Figure 5c) to get an estimate of the DC shift. The
median is a more robust measure because it finds the peak
if one exists, and yields zero (i.e., no shift) if no
significant peak exists.

Figure 6 is a functional block diagram of this algorithm. A difference
histogram is formed by subtracting the current line from the
previous filtered line. The median of the histogram is added to the
current delayed line to get the current synthetic DC restored line.
The line-to-line background correlation is thus restored. Figures
3c and 4c are the degraded test patterns of Figures 3b and 4b
restored using a simulation of this algorithm. We note that the
line-to-line correlation has been completely restored and even the
gradual shading in the middle of the pattern has been eliminated.
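
The median-based restoration described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not the simulation used for Figures 3 and 4: AC coupling is modeled as removal of each line's mean, the median is taken directly with the standard library rather than through a histogram, and the names `ac_couple` and `restore_dc_median` are hypothetical.

```python
from statistics import median

def ac_couple(scene):
    """Model steady-state AC coupling: every scan line loses its own mean."""
    return [[p - sum(line) / len(line) for p in line] for line in scene]

def restore_dc_median(lines):
    """Add to each line the median of (previous restored line - current line)."""
    restored = [list(lines[0])]  # the first line serves as the reference
    for line in lines[1:]:
        reference = restored[-1]
        diffs = [r - c for r, c in zip(reference, line)]
        shift = median(diffs)  # estimate of the background DC shift
        restored.append([c + shift for c in line])
    return restored

# Uniform background of 10 with a hot target (50) on the middle line only.
scene = [[10] * 8, [10] * 6 + [50] * 2, [10] * 8]
coupled = ac_couple(scene)      # background of the middle line is depressed
restored = restore_dc_median(coupled)
for row in restored:
    print(row)
```

In this toy scene the background of all three restored lines returns to the same level, so the streak along the line containing the hot target is removed.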

a. Shows the shifting of background level because of the presence of a very

b. Histogram of the difference of line 1 and line 2

Figure 5. Line-to-Line Pixel Differences from AC Coupling

Figure 6. Functional Block Diagram of the Histogram DC Restore Algorithm

2. All-Analog Approach to Synthetic DC Restoration--The
histogram approach has proved very effective in computer
simulations as seen above, but real time implementation
design has proved difficult. We have to find the median of
the 512 point histogram in the scan retrace time (7 to 10 µsec)*,
which makes it necessary to use very high speed, high power
dissipation memories as well as a high speed analog/digital
converter. Therefore, we have derived an alternate all-analog
scheme that is very similar in principle to the histogramming
approach but uses a simpler technique to estimate
the DC shift of the background. This approach does not
require histogramming and requires only analog integrators,
which makes for a very simple implementation. The concept
has been computer simulated, and works just as well as the
histogram approach on the test patterns on which NVL has
proposed to exercise the scheme.

*Seven µsec for an 875-line system and 10 µsec for a 525-line system.

Description  of  the  Algorithm 

Referring  to  Figure  7,  a hot  target  appears  on  Scan  Line  2 that  was  not 
present  in  Scan  Line  1.  Because  of  AC  coupling,  the  average  value  of  the 
video  on  Scan  Lines  1 and  2 is  now  the  same.  Therefore,  the  background 
intensities  on  Scan  Line  2 have  dropped  by  a constant  offset  relative  to 
Scan  Line  1.  This  shift  causes  the  lines  with  the  hot  target  to  appear 
darker  than  the  rest,  resulting  in  a streak.  We  need  to  estimate  this 
shift  in  the  background  and  add  it  to  the  second  scan  line  so  that  the 
streak  around  the  hot  (or  for  that  matter,  very  cold)  target  disappears. 

In  Figure  7,  if  we  average  the  positive  and  negative  parts  of  the  difference 
signal  separately,  the  offset  we  want  should  be  one  of  these  two  averages. 
We  need  the  average  that  corresponds  to  the  background.  This  is  the 
average  that  lasts  the  longest  over  the  scan  line  (targets  are  smaller  than 
background  areas). 

Figure 8 shows the functional schematic of this process. The positive
and negative averaging is done by integrating over one scan line in
analog integrators and dividing by the corresponding counts. The choice
between the two averages is made according to which of the two counters
(+ or -) is greater at the end of the scan line. This shift is added to
the cumulative offset (initialized to zero at the beginning of the field).
The cumulative offset is added to the current scan line (now delayed).
Note that only the analog division and selection of the two averages need
be made at the end of the scan line (in 7 to 10 µsec). In the histogram
scheme, we would need to step through all 512 histogram bins to find the
median in this period. Further, the scheme involves no analog-to-digital
(A/D) converters and very little digital logic.

Figure 7. The Analog Approach to Synthetic DC Restoration

Figure 8. All-Analog Scheme Conceptual Block Diagram

The scheme assumes that the hot spots appear on less than one-half of
the scan line (a valid assumption) and the background does not change
rapidly from one scan line to the next. These are the same assumptions
the original histogram approach made. The scheme was computer
simulated on the two candidate test patterns in Figures 3 and 4 and
works just as well as the histogramming approach.
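
The selection logic of the all-analog scheme can be sketched in software. This is an illustrative discrete-time model under stated assumptions, not the hardware design: the integrators are replaced by sums over pixels, samples with zero difference are counted with the negative side (mirroring a count obtained by subtracting the positive time from the line time), and the name `restore_dc_analog` is hypothetical.

```python
def restore_dc_analog(lines):
    """Estimate each line's DC shift from the longer-lasting sign of the
    line-to-line difference and accumulate it down the field."""
    cumulative = 0.0                      # reset at the start of the field
    restored = [list(lines[0])]
    for prev, cur in zip(lines, lines[1:]):
        diff = [p - c for p, c in zip(prev, cur)]   # raw previous - current
        m_pos = sum(d for d in diff if d > 0)       # positive integral
        m_neg = sum(d for d in diff if d <= 0)      # negative integral
        t_pos = sum(1 for d in diff if d > 0)       # "time" positive
        t_neg = len(diff) - t_pos                   # remainder of the line
        # The background lasts longest, so pick the average with more time.
        shift = m_pos / t_pos if t_pos > t_neg else m_neg / t_neg
        cumulative += shift
        restored.append([c + cumulative for c in cur])
    return restored

# AC coupled lines: the middle line holds a hot target on 2 of 8 pixels.
coupled = [[0.0] * 8, [-10.0] * 6 + [30.0] * 2, [0.0] * 8]
for row in restore_dc_analog(coupled):
    print(row)
```

On this example the background of every output line sits at the same level, matching the behavior of the median-based approach on the same input.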

Hardware  Implementation  of  Synthetic  DC  Restore  Schemes 

Histogram Approach--The functional diagram of the histogram DC restore
algorithm was shown in Figure 6. The basic functions are to compute the
pixel-by-pixel differences of a line from the corresponding pixels on
the previous DC restored line, and find the median of the differences.
The median represents the DC shift the background suffered in the
presence of a major scene change on successive scan lines, and is
added back to the current line to restore the line-to-line correlation.

One way to compute the median is to histogram the differences. At the
end of the scan line, the median is computed by accumulating the
histogram sequentially in bins until one-half the total line population
is reached. The corresponding bin then gives the median of the
difference distribution.
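
The bin-accumulation step can be sketched as follows. This is a minimal illustration, assuming the differences have already been quantized to non-negative integer codes (for example, offset-binary A/D output); the function name `histogram_median` and the default of 256 bins are assumptions chosen for the example.

```python
def histogram_median(codes, n_bins=256):
    """Median of quantized samples found by accumulating histogram bins
    until one-half of the line population is reached."""
    hist = [0] * n_bins
    for code in codes:            # one read-increment-write per sample
        hist[code] += 1
    total = len(codes)
    running = 0
    for bin_index, count in enumerate(hist):   # done during retrace
        running += count
        if 2 * running >= total:  # half the population reached
            return bin_index
    raise ValueError("empty line")

print(histogram_median([3, 3, 3, 7, 200]))
```

The second loop is the part that must finish within the retrace interval, which is why the number of bins drives the required scanning rate.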


This function can be performed in the hardware implementation shown in
Figure 9. The input analog signal is differenced from the previous DC
restored (and delayed) line and A/D converted to eight bits. This result
addresses a 256 x 8 random access memory (RAM) which is used as
a histogram counter. To avoid adders, a read only memory (ROM) is
used to increment the bin counts. The ROM forms a 256 entry
lookup table with the output simply being the input +1. The read,
update, and write cycles for a given point should be completed before
the next point is converted by the A/D unit.

This process continues for each of about 500 samples along a line.
When the blanking and retrace time at the line's end is reached, the
contents of successive RAM locations are added until the histogram
median is reached. This value is latched onto the inputs of a
digital-to-analog (D/A) converter and the D/A output is added to all
samples along the delayed input video. The process continues recursively
as each new line comes along.

Figure 9. DC Restoration Circuit: Block Diagram

Analysis of the Histogram Implementation--It can be shown that for a
512-point histogram with eight bits of resolution, an A/D conversion
speed of 10 to 16 MHz is needed for the 525- to 875-line video systems.
Also, the scanning frequency during retrace* for computing the median
must range from 25 to 37 MHz. To accomplish such speeds will require
ultra-fast A/D converters using the "flash" conversion technique, and
high speed bipolar memories. The memory will dissipate five watts of
power and occupy a 6 in. x 4 in. board. The A/D and sample/hold
unit will need most of a 6 in. x 4 in. card and cost $2K. Support logic,
the ROM, the D/A, adders, etc., will occupy at least one other card
and dissipate considerable power. Thus, a total of three 6 in. x 4 in.
cards will be needed.
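
The quoted rates can be roughly reproduced with simple arithmetic. The sketch below is not the derivation used in the report; it assumes 30 frames per second, an active line time equal to the line period minus the retrace time (10 µsec for 525 lines, 7 µsec for 875 lines), 512 samples per line, and 256 histogram bins scanned during retrace. The function names are hypothetical.

```python
def line_period_us(lines_per_frame, frames_per_sec=30.0):
    """Horizontal line period in microseconds (assumed 30 frames/sec)."""
    return 1e6 / (lines_per_frame * frames_per_sec)

def required_rates_mhz(lines_per_frame, retrace_us, samples=512, bins=256):
    """Return (A/D sample rate, retrace bin-scan rate) in MHz."""
    active_us = line_period_us(lines_per_frame) - retrace_us
    adc_mhz = samples / active_us    # samples per microsecond == MHz
    scan_mhz = bins / retrace_us     # every bin visited during retrace
    return adc_mhz, scan_mhz

for n_lines, retrace in ((525, 10.0), (875, 7.0)):
    adc, scan = required_rates_mhz(n_lines, retrace)
    print(f"{n_lines}-line: A/D ~ {adc:.1f} MHz, bin scan ~ {scan:.1f} MHz")
```

Under these assumptions the A/D rate comes out near 10 and 16 MHz and the bin-scan rate near 26 and 37 MHz, close to the ranges quoted above.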

The histogram method can be made more feasible for hardware
implementation by decreasing the quantization resolution and/or the
samples/line, or by changing the algorithm. Decreasing the resolution will
reduce the ability to DC restore low-level input signals. A resolution
of six bits is a minimum. Decreasing the samples/line to 64 will
still require a 1 to 2 MHz A/D, which occupies comparable real estate
with the 512-sample/line A/D; less than 64 samples could cause
significant errors. The only algorithm modification possible is the
addition of another line delay to decrease the scan rate. Doing this
will require a double memory and no great reduction in overall system
speed. These are the reasons why we went to the analog approach.

*Seven µsec for an 875-line system and 10 µsec for the 525-line system.

Implementation of the Analog Scheme--A block diagram of the hardware
implementation for the all-analog scheme described in Section II is
shown in Figure 10. The raw input video is delayed one horizontal
scan line time by the CCD321A module. The difference between the
current and the previous line is taken by the summer Σ1. The analog
switch S1 selects the output of Σ1 during the line trace time and
ground (zero) during the horizontal blanking interval. The output of
S1 goes to a positive half-wave rectifier. The output of the half-wave
rectifier is added to the negative of its input by Σ2 to perform a
negative half-wave rectification. Both the output of the positive
rectifier and Σ2 are time integrated by Integrators 1 and 2; these
integrators (as well as Integrator 3) are cleared at the end of each
blanking interval. The outputs of the integrators (M+ and M-) go to
an analog multiplexer to be selected by the logical output from
Comparator 2.

Figure 10. All-Analog Scheme Implementation Block Diagram

Comparator 1 provides a logical "1" when the difference information
from S1 is positive (it is set to a logical "0" during blanking). The
output of the comparator is integrated by Integrator 3; the output is
proportional to the time that the video difference signal is positive.

At the end of the scan line, subtracting the positive time-on (T+)
from the maximum time-on (Tmax) gives the time the video difference
was negative (T-). Comparator 2 gives a logic "1" output if T+ > T-,
and a "0" if T+ < T-. This comparator output causes Multiplexer 2 to
select the larger of T+ and T-, and Multiplexer 1 to select the
corresponding integral value (i.e., M+ goes with T+, M- goes with T-).

The multiplexer outputs (M and T) are divided and are input to the
summer Σ4. The Σ4 loop keeps a running total of the divider outputs for
computing the DC shift estimate. Two sample and hold units (S/H 1 and
S/H 2) are used in a master/slave configuration as an analog adder
with a line delay. The clock on S/H 1 (CK1) samples during the
horizontal retrace time. The clock on S/H 2 (CK2) follows CK1
and is non-overlapping. The output of S/H 1 serves as the DC shift
estimate and is added to the delayed video by summer Σ5. The S/H
units are cleared every frame using S2 to prevent an unstable loop.

All components in the circuit are analog, except for one-shots for
clocking the sample/hold units. The summers are made with LM318
type operational amplifiers (op-amps) with 20 MHz small signal
bandwidths. Integrators 1 and 2 are made with low offset
LM356 BI-FET amplifiers with 20 V/µsec slew rates and 4 MHz
bandwidths. Integrator 3 is made with an LM318 op-amp with
maximized slew rate (150 V/µsec) and minimized settling time
(< 1 µsec). All three integrators use 4066 CMOS analog switches
for clearing the integrator during retrace. Comparators 1 and 2
are an LM319 dual comparator and have 80 nsec rise times. The
output of Comparator 1 is buffered through a gate for strobing during
blanking and producing approximately a zero or five volt output.

The analog switches and multiplexers are all low-transient 4066
type CMOS switches. The divider is an Analog Devices 531K
transconductance analog multiplier/divider with 2 µsec settling time
at one percent accuracy. Another divider can be substituted, since
the divider input is always bounded from 1/2 Tmax to Tmax, and
this small range will not significantly affect most divider settling times.
The sample/hold units are Datel SHM-LM-2 types with 4 µsec
acquisition time.


The approximate power required (not including the CCD321A) is 2 W.
The circuit should fit on one 6 in. x 4 in. card. The cost of the parts
for the circuit (without the CCD) is $200. The CCD unit and support
circuitry will dominate the cost.

LOCAL  AREA  GAIN/BRIGHTNESS  CONTROL  (LAGBC) 

Referring  to  the  system  diagram  for  the  gain/brightness  control  in 
Figure  1,  the  local  area  gain/brightness  control  appears  prior  to 
the  display  of  the  video,  after  the  synthetic  DC  restore  has  been 
accomplished.  It  performs  the  following  functions: 


• Varies the local average brightness of the displayed image
(bias), so that the overall dynamic range of the scene is compressed.

• Enhances  local  variations  above  the  contrast  sensitivity 
threshold  of  the  human  eye. 

• Automatically  fits  the  intensity  extremes  in  the  enhanced 
video  scene  to  the  display  limits. 


A functional description of the algorithm is shown in Figure 11. The
image intensity at each point is transformed based on local area statistics:
the local mean $M_{ij}$ and the local standard deviation $\sigma_{ij}$, computed on a
local area surrounding the point. The transformed intensity is then:

where M is the global mean.

Figure 11. Functional Flow Description of Local Area
Gain/Brightness Control (LAGBC) Algorithm

In words, the local area mean is first subtracted from the image at every
point. A variable gain is applied to the difference to amplify the local
variation. A portion of the local mean is then added back to restore
the subjective quality of the image. The local gain G_ij is itself locally
adaptive, being proportional to M, to satisfy psychovisual considerations
(Weber's Law), and inversely proportional to σ_ij, so that areas with small
local variance receive larger gain.

To  prevent  the  gain  from  being  inordinately  large  in  areas  with  large  mean 
and  small  standard  deviation,  the  local  gain  is  actually  controlled  as  in 
Figure  12. 
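
The transform and gain limiting described above can be illustrated for a single pixel with the following Python sketch. It assumes the form I' = G (I - M_local) + M_local with G = alpha * M / sigma_local, limited to the range G_min to G_max. That form, the function name, and the default values of alpha, g_min and g_max are assumptions made for illustration; they are not taken from the report or from Figure 12.

```python
def lagbc_pixel(intensity, local_mean, local_std, global_mean,
                alpha=0.5, g_min=1.0, g_max=5.0):
    """Transform one pixel with a locally adaptive gain (assumed form)."""
    # Gain proportional to the global mean and inversely proportional to
    # the local standard deviation; guard against division by zero.
    if local_std > 0.0:
        gain = alpha * global_mean / local_std
    else:
        gain = g_max
    # Limit the gain so that flat areas are not amplified without bound.
    gain = max(g_min, min(g_max, gain))
    # Amplify the local variation, then add the local mean back.
    return gain * (intensity - local_mean) + local_mean
```

With these hypothetical settings, a pixel 0.1 V above a local mean of 0.5 V in an area with a local deviation of 0.1 V receives a gain of 2.5, while the same pixel in a nearly flat area is held to the maximum gain.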

Computing the local area mean and standard deviation in Figure 11 is
similar to spatial filtering because the local mean is really the convolution
of a rect function with the image. Convolution with the rect function is
really low pass filtering because its frequency response, the sinc function,
is a low pass filter. Hence we can replace the local average function with
an equivalent recursive low pass filter.

Figure 12. Local Area Gain vs. [illegible] Curve Used

The local standard deviation is approximated by a similarly low passed
version of the absolute difference between the image intensity and the
local mean estimate M_ij. Figure 13 is a realization of the LAGBC
using linear recursive low pass filters to estimate the local mean and
standard deviation.

Figure 13. LAGBC with Linear Recursive Low Pass Filters

Realization of the Recursive Low Pass Filter--As demonstrated above,
the two-dimensional recursive low pass filter is the basic building
block of the LAGBC scheme.
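
The local statistics estimates described above (a low passed image for the mean, and a low passed absolute difference for the deviation) can be sketched in one dimension as follows. This is only an illustration: the function name and the default filter parameter are hypothetical, and the input weight is normalized to 1 - e^(-gamma) so that a constant input gives the same constant output, which is an assumption of this sketch.

```python
import math

def local_stats(line, gamma=0.2):
    """Estimate the local mean and deviation along one line of samples."""
    decay = math.exp(-gamma)
    # Assumed normalization: unity gain for a constant input.
    weight = 1.0 - decay
    mean, dev = line[0], 0.0
    means, devs = [], []
    for sample in line:
        # First-order recursive low pass of the image gives the local mean.
        mean = weight * sample + decay * mean
        # The same filter applied to |I - M| approximates the deviation.
        dev = weight * abs(sample - mean) + decay * dev
        means.append(mean)
        devs.append(dev)
    return means, devs
```

A constant line gives a constant mean and zero deviation, while a step in the input produces a nonzero deviation estimate that decays as the mean catches up.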


The two-dimensional separable first-order recursive filter has a frequency
response given by the product of the two one-dimensional filter responses:

    |H(f_1, f_2)| = \left[1 + (f_1/f_c)^2\right]^{-1/2} \left[1 + (f_2/f_c)^2\right]^{-1/2}

where f_c is the 3 dB cutoff frequency of the low pass filter.

The equivalent sampled data filter has a two-dimensional Z transform

    H(z_1, z_2) = \frac{\gamma^2}{(1 - e^{-\gamma} z_1^{-1})(1 - e^{-\gamma} z_2^{-1})}

where

    \gamma = 2\pi f_c / f_s

Changing γ changes the effective size of the local area, i.e., the area
over which the local mean averaging is done. The above separable
filter can be realized using two distinct (but functionally equivalent)
structures. These are described below.

1. Nonseparable implementation.

In the Z domain, let the output be Y(z_1, z_2) and the input be X(z_1, z_2).
Then,

    \frac{Y(z_1, z_2)}{X(z_1, z_2)} = \frac{\gamma^2}{(1 - e^{-\gamma} z_1^{-1})(1 - e^{-\gamma} z_2^{-1})}

This form, implemented directly, gives the recursive relation

    y(m,n) = \gamma^2 x(m,n) + e^{-\gamma} y(m-1,n) + e^{-\gamma} y(m,n-1) - e^{-2\gamma} y(m-1,n-1)

where m is the row number, and n is the column number in the
image.

The schematic for implementing this realization on a real time
video stream is shown in Figure 14. We see that the line delay
and the two pixel delays give us the necessary delays to perform
the filtering.

Figure 14. Implementation of Two-dimensional Recursive
Low Pass Filter (Approach 1)

2. Separable implementation.

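Define a new intermediate variable

    W(z_1, z_2) = \frac{\gamma X(z_1, z_2)}{1 - e^{-\gamma} z_2^{-1}}

Then,

    Y(z_1, z_2) = \frac{\gamma W(z_1, z_2)}{1 - e^{-\gamma} z_1^{-1}}

Therefore,

    y(m,n) = \gamma w(m,n) + e^{-\gamma} y(m-1,n)

and

    w(m,n) = \gamma x(m,n) + e^{-\gamma} w(m,n-1)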

Thus we break up the two-dimensional filter into two one-dimensional
filters in cascade. Figure 15a shows this filter realized in the above
manner. The output of this filter is exactly equivalent to that of
Figure 14 because the transfer function is separable. This realization
has certain advantages over the nonseparable implementation,
described below.

The parameter γ is in the range 0.1 to 0.3, which makes e^{-γ} ≈ 1 - γ
in the range 0.9 to 0.7. In the nonseparable realization, the smallest
weight, γ^2 = 0.01 to 0.1, is much smaller than the largest weight,
1 - γ = 0.7 to 0.9. The coefficient precision needed in the summing
amplifier in Figure 14 is therefore very stringent. On the other hand,
the coefficient range in the separable formulation is much smaller:
γ = 0.1 to 0.3 for the smallest weight and 1 - γ = 0.9 to 0.7 for the
largest. Therefore, the separable structure is much less susceptible
to charge coupled device (CCD) noise and amplifier gain variation
than is the corresponding nonseparable structure.
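
The equivalence of the two structures can be checked numerically. The Python sketch below implements both recursions with zero initial conditions, assuming the input weights gamma^2 (nonseparable) and gamma per stage (separable) as reconstructed from the text. The function names are hypothetical, and the sketch works on a small list-of-rows image rather than a video stream.

```python
import math

def nonseparable(x, gamma):
    """Direct recursion with one line delay and two pixel delays."""
    e = math.exp(-gamma)
    rows, cols = len(x), len(x[0])
    y = [[0.0] * cols for _ in range(rows)]
    for m in range(rows):
        for n in range(cols):
            up = y[m - 1][n] if m > 0 else 0.0
            left = y[m][n - 1] if n > 0 else 0.0
            diag = y[m - 1][n - 1] if m > 0 and n > 0 else 0.0
            y[m][n] = gamma ** 2 * x[m][n] + e * up + e * left - e * e * diag
    return y

def separable(x, gamma):
    """Cascade of a horizontal filter (w) and a vertical filter (y)."""
    e = math.exp(-gamma)
    rows, cols = len(x), len(x[0])
    w = [[0.0] * cols for _ in range(rows)]
    y = [[0.0] * cols for _ in range(rows)]
    for m in range(rows):
        for n in range(cols):
            w[m][n] = gamma * x[m][n] + e * (w[m][n - 1] if n > 0 else 0.0)
            y[m][n] = gamma * w[m][n] + e * (y[m - 1][n] if m > 0 else 0.0)
    return y
```

The two outputs agree to within floating point rounding for any input image. The sketch also makes the coefficient spread visible: the direct form mixes weights as small as gamma^2 with weights near unity in one summation, while each stage of the cascade only uses gamma and e^(-gamma).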


The separable structure has the additional advantage that the first low
pass filter (along the scan direction) can be easily implemented in a
passive RC first-order circuit, as shown in Figure 15b. At video
frequencies, the required resistor (R) and capacitor (C) values are
very reasonable. This eliminates the need for the single pixel delay
for the first filter. However, the second low pass filter in the vertical
direction is still a sampled data filter as before (Figure 15b).

The  above  LAGBC  scheme,  using  the  separable  recursive  low  pass 
structure  and  CCD  line  delays,  has  already  been  breadboarded  and  is 
undergoing  testing.  Following  is  a description  of  this  hardware  and  a 
discussion  of  the  various  building  blocks  of  the  schematic. 

LAGBC  Breadboard  Description 

The  LAGBC  breadboard  implementation  block  diagram  is  shown  in 
Figure  16.  The  different  sections  are  outlined  in  dashed  lines  and 
numbered. 


In Figure 16, the LAGBC input intensity information is already sync
separated and scaled from "0" (black) to "1" (white) volts* by the
global gain/bias unit. This intensity information is processed by the
local mean two-dimensional separable low pass filter in Block 1. As
shown in the detailed diagram of this filter in Figure 17, filtering is
first done horizontally by a first-order analog RC low pass filter with
an adjustable cutoff frequency and then vertically by a CCD recursive
low pass filter with coefficient γ. The CCD filter is implemented with
high bandwidth operational amplifiers and half of a Fairchild CCD 321A
dual 455-sample CCD, which is used as the horizontal line delay (1H).
Both the RC cutoff frequency and γ are potentiometer adjustable. The
CCD input is AC coupled in the feedback loop to eliminate the need to
bias the CCD input and to reduce the sensitivity of the loop to DC
offsets. As a result of this AC coupling, an additional low pass filter
is added to recover lost low frequency information and eliminate the
transient effects of having a high pass filter in the loop. The low pass
filter has a cutoff frequency of ω_c/γ, where ω_c is the AC coupling
high pass filter cutoff frequency (approximately 1 Hz). The low pass
output is fed forward to a summer to give the estimate of the local
mean M_ij.

*See the global gain/bias subsection of this report. Scaling is needed
due to dynamic range of the LAGBC circuitry.

Figure 16. LAGBC Implementation Block Diagram

Implementation Diagram



In Figure 16, the local mean M_ij is subtracted from the input intensity
I_ij by an op-amp summer in Block 2. An op-amp and diode network in
Block 3 takes the absolute value of this difference, and the result is
two-dimensionally filtered in Block 4 to give the estimate of the local
standard deviation, σ_ij. This two-dimensional filter uses the other half
of the CCD 321A unit and is identical to the filter used to compute M_ij,
except for independent adjustment of filter coefficients.


The computation of the local gain G_ij involves the inversion of σ_ij. If
the simulation values for α, G_min, and G_max (see Figure 12) are used,
the range of relevant values for σ_ij is from 0.025 to 0.25 V for an
input (intensity) range of 0 to 1 V, assuming the global mean is used for
the mean term in the gain equation. To simplify the circuitry and
desensitize the gain with respect to the small values of σ_ij, a linear
approximation to the inversion, shown in Block 5, was used. The slope and
bias of this approximation, each a constant K scaled by α, are adjustable
(α is the weighting term in the local gain equation). A single op-amp
implements this inversion circuit. The output of the inversion circuit is
peak limited between G_min and G_max by an adjustable threshold diode
network in Block 6.

The local gain G_ij is multiplied by (I_ij - M_ij) with a high bandwidth
analog multiplier in Block 7 and the result added to M_ij in Block 8 by an
op-amp summer to produce an enhanced output. The signal is normalized by
adjusting the contrast (AC gain) and brightness (DC level) in Block 9. A
black level is set in Block 10 by using an analog switch during blanking.
The signal is then peak limited to eliminate "spikes" and overshoot by
a diode threshold pair in Block 11. The output of this limiter is used
by the target screener. To display the signal on a video monitor, the
composite sync signal is added in with an op-amp summer and then
buffered with a video driver in Block 12. The resultant output is a
composite video signal capable of driving a 75 Ω load.

LAGBC  Breadboard  Evaluation 

Preliminary evaluation of the LAGBC breadboard has been done. The
values of filter coefficients, gain maximums and minimums, etc., from
the simulations were used in the hardware. The images tested were
from data furnished by NVL using a Hughes FLIR. The data from a
video disk recorder were processed at the 525-line standard television
rate, displayed on a video monitor, and photographed.

The input to the LAGBC was first automatically scaled between 0 and 1 V
by an AGC unit using the frame video mean and standard deviation; this
adjusted the contrast and brightness on a sync separated video input from
a video disk, tape, camera, etc. The AGC does not correct saturated
inputs as will the global gain/bias control for the FLIR, but it does
adjust small signals to the range needed by the LAGBC.

For  evaluation,  the  LAGBC  settings  were  preset  and  not  adjusted  for 
any  of  the  test  inputs.  Original  and  enhanced  test  photos  of  the  video 
monitor  output  are  shown  in  Figures  18  to  21.  It  can  be  seen  in  these 
photos  that  the  contrast  between  large  areas  is  decreased  while  small 
local  area  contrast  increases.  This  is  seen  by  comparing  Figure  18a 
and  18b,  and  looking  at  the  black  (smoky)  area  on  the  left  side.  In 
the  original  image  part  of  a target  can  be  seen,  but  in  the  enhanced  image 
there  are  two  obviously  better  defined  target  areas.  Similar  effects 
are  seen  in  the  other  photos. 

These  initial  tests  on  the  breadboard  show  that  the  LAGBC  concept  works 
in  relatively  simple  analog  hardware.  Tests  still  must  be  run  to 
optimize  filter  coefficients  and  other  adjustable  parameters  for  use  with 
the  MODFLIR. 

GLOBAL  GAIN/BIAS  CONTROL 

PATS Global Gain/Brightness Control Approach

Thermal imagers such as the MODFLIR have two controls that need
constant readjustment for optimum viewing: 1) the gain control that
controls the gain of the post-amps and hence the sensitivity to temperature
differences in the scene; and 2) the brightness control that adds a bias to
the post-amp output (LED driver) stage to shift the temperature range of
the scene up and down to match the luminance range of a display or the
storage range of a digital frame buffer.

Figure 1 shows the functional diagram for the global gain/brightness
control, DC restore and local area gain/brightness control for the ATS.
The LED/vidicon interface has a dynamic range of approximately 40 dB,
and therefore a global control of the gain and bias is incorporated by
feeding the appropriate voltages back to the post-amps and LED driver
cards, to ensure that the vidicon output will never be saturated and
optimum use of the interface dynamic range is made. If a digital
frame buffer with eight bits of resolution is used, this feedback method
will still be needed to optimize the A/D converter dynamic range (50 dB).
It is the function of the local area gain control, then, to expand or
contract the dynamic range of the local areas of the scene to the full
luminance range (~20 dB) of the display. In this manner, the entire
10,000:1 temperature dynamic range expected at the detector can be
handled in a completely hands-off mode by the combination of the
global and local area gain and brightness controls without exceeding
the dynamic range of any system component.

Figure 22. Functional Schematic of Global Gain and Brightness Control

Figure 22 shows a functional block diagram of the global control
algorithm to make optimum use of the LED vidicon or frame buffer
interface dynamic range. The number of exceedances of two thresholds
corresponding to the upper and lower limits* on the vidicon or A/D
output are counted and integrated over a field time (1/60 sec). These
counts are used to generate the gain and bias increments ΔG and ΔB,
which are used to change the gain and bias voltages fed back to the
FLIR boards. For example, if the upper exceedances are significantly
more than the lower exceedance counts, the bias increment is made
negative, and vice versa. The gain increment is determined in a
similar fashion and would be a function, for example, of the sum of
the lower and upper exceedance counts. No change will be made to
the gain and bias if the lower and upper exceedance counts are within
a prespecified range, typically 0.1 percent or less of the entire frame.

In this manner isolated noise peaks are allowed but any significant
saturation of the vidicon, and subsequent display, will be avoided. Since
the scene extrema do not change very rapidly, the feedback process will
converge quickly to a stable value.

*Lower (black) limit 0, and upper (white) limit 1 V will be typical.
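
One possible reading of this update rule is sketched below in Python. The report gives only the qualitative behavior (bias driven by the imbalance of the two counts, gain by their sum, no change within a tolerance), so the function name, the fixed step sizes, and the exact comparisons are assumptions made for illustration. The sketch only reduces gain; how the gain is raised for an under-ranged signal is not described in this section.

```python
def gain_bias_increments(upper, lower, pixels, tolerance=0.001,
                         gain_step=0.01, bias_step=0.01):
    """Return (delta_gain, delta_bias) from one field's exceedance counts."""
    allowed = tolerance * pixels
    # No change while both counts stay within the allowed range.
    if upper <= allowed and lower <= allowed:
        return 0.0, 0.0
    # Bias follows the imbalance between white and black exceedances.
    if upper - lower > allowed:
        delta_bias = -bias_step
    elif lower - upper > allowed:
        delta_bias = bias_step
    else:
        delta_bias = 0.0
    # Gain follows the total number of exceedances (reduce when clipping).
    delta_gain = -gain_step if upper + lower > 2 * allowed else 0.0
    return delta_gain, delta_bias
```

For a hypothetical field of 100,000 pixels the tolerance is 100 exceedances: counts of (50, 20) leave the controls alone, while counts of (400, 400) reduce the gain without shifting the bias.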

Global  Gain /Bias  Implementation 

A simple implementation diagram of the scheme featuring a microprocessor
system is shown in Figure 23. The input signal from the
vidicon or frame buffer is thresholded by an upper and lower threshold
comparator; the output of each is a logical level. The comparator
outputs are gated by a clock that runs at the pixel sampling rate
during the active line time and is off during blanking. Two binary
counters are clocked once for each pixel at which the corresponding
threshold is exceeded, giving a count of the number of exceedances per
frame. The most significant bits of the counters are examined by the
microprocessor system to make the computation for the gain (G) and bias
(B) to the FLIR. The values of G and B are converted to analog format
and buffered to drive the FLIR control inputs.

The microprocessor specified for this application is the single-chip Intel
8748 with 1 K of internal program memory. Most instructions are
executed in 2.5 to 5.0 µsec, meaning that 200 to 400 instructions can be
executed in the 1 msec vertical retrace time. If necessary, computation
can overlap into the actual trace time. A complex and flexible program
can be placed in the 1 K of the Intel 8748 program storage, allowing for
numerous algorithm implementations to be tried once the FLIR has
arrived.
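
The instruction budget quoted above follows directly from the retrace time and the instruction time, as this small Python check shows. The function name is hypothetical; the numbers are those given in the text.

```python
def instructions_per_retrace(retrace_usec=1000.0, instruction_usec=2.5):
    """Whole instructions that fit in one vertical retrace interval."""
    return int(retrace_usec // instruction_usec)
```

A 1 msec retrace holds 400 instructions at 2.5 µsec each and 200 at 5.0 µsec each, which is the range stated in the report.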

For initial development, the Intel 8080 MDS simulator will be used for
algorithm testing. The 8748 and 8080 have very similar instruction
sets and timing. Once operational on the MDS system, the algorithm
will be programmed into the 8748 for installation in the final hardware.


SECTION III

TARGET SCREENING

This section covers the progress made in this reporting period on the
following target screening tasks:

• Data preparation

• Image segmentation

• Feature extraction

SECTION SUMMARY

A total of 260 FLIR image frames containing tanks, trucks, and armored
personnel carriers (APCs) have been digitized to date and annotated in
the data preparation task. They have been debanded where necessary.
Target positions and types have been identified and recorded on the
digital tape header.

The image segmentation part of the simulation software is now fully
operational, with a number of improvements to the present Augmented
Target Screener system (ATSS) autothreshold, background estimate,
and the object interval generation criterion to make the segmentation
more robust. Feature extraction software has been developed to
extract moment features on object segments as well as Fourier
descriptors (FDs) of the object boundary extracted by the segmenter.
Analysis software has been developed to plot and analyze the
discriminatory powers of these features. The test images are being run
through the simulation to evaluate the recognition features. Following
is a more detailed report of the progress on these tasks.

DATA  PREPARATION 

The  five  classes  of  targets  of  interest  to  PATS  are: 

• Tank 

• APC 

• 2-1/2  ton  truck 

• Tracked  missile  launcher 

• Tracked  anti-aircraft  cannon 

Training data for statistical classifier design are available to a limited
extent on the first three classes: tank, APC, and truck. Targets
have strong elevation and aspect angle dependence, as we see in
Figures 24a and b. Therefore we not only have five classes of targets to
discriminate, but each target class has subclasses that depend on the
target aspect and elevation. That is why, when extracting the recognition
features and designing the subsequent classifier, we have to keep the
target aspect, elevation, and type identities separate. Also, we need
training data that statistically represent all the aspect and elevation
angle combinations expected.

It is easy to see that shape features extracted cannot be made aspect
independent (unless we were looking straight down, in which case
rotation invariance could imply aspect invariance). But range
invariance, up to a point, is easier because it merely involves
scaling. As for elevation, we are interested in low elevation angles
for the "pop-up" mission scenarios expected in the Advanced Attack
Helicopter (AAH) program.

The  sensor  altitude  in  these  missions  is  small  (treetop,  < 20  m),  and 
meaningful  ranges  (>  200  m)  imply  very  small  elevation  angles  (<  10 
degrees).  Therefore,  in  the  classifier  design,  we  intend  to  fix  the 
elevation  angle  at  approximately  0 to  10  degrees,  which  simplifies 
the  problem  somewhat. 
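
The elevation angle bound quoted above can be checked with simple geometry. The Python sketch below assumes level terrain and treats the stated range as ground range; the function name is hypothetical.

```python
import math

def elevation_angle_deg(altitude_m, ground_range_m):
    """Angle below the horizontal from the sensor to a target on the ground."""
    return math.degrees(math.atan2(altitude_m, ground_range_m))
```

For the limiting case of a 20 m altitude at a 200 m range, the angle is about 5.7 degrees, which is inside the 0 to 10 degree band chosen for the classifier design; longer ranges or lower altitudes give smaller angles still.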

IMAGERY  SOURCES 

We now have two sources of video taped 525-line imagery: the Honeywell
FLIR imagery taken by Marge Krebs of Honeywell, and the new serial
scan FLIR data supplied by NVL. The Krebs data consist of imagery
acquired from a light aircraft of tanks, APCs, and 2-1/2 ton trucks, at an
altitude of approximately 3,000 feet at ranges of 10 miles to flyover.
Because of this high altitude, all frames at useful slant ranges tend to
have high elevation angles (typically > 25 degrees). The target aspects
represented are primarily side and oblique side. Very few front and
rear aspects are present in this test set.

The NVL data, on the other hand, were acquired from a low altitude
platform FLIR and therefore the elevation angles are near zero. The target
classes here are predominantly tanks with a fair representation of APCs.
Multiple targets are present and therefore this represents a good test
base. All aspects and ranges are adequately represented. However,
there are no 2-1/2 ton trucks in this imagery.

We have digitized a total of 160 frames from the Krebs video tape of
medium to far range trucks, tanks, and APCs. These data consist of
elevation angles between 30 and 45 degrees and target aspects of mostly
side and slant side. The FLIR with which these data were acquired had
two channels with different coupling capacitor constants. As a
consequence, one channel exhibits a more severe droop than the
other along a scan line. This resulted in severe banding of the
imagery near the right of the frame. Figures 25a and b show examples
of this banding. This banding can play havoc in the segmentation stage
by causing spurious vertical edges. Therefore, we devised a statistical
debanding algorithm that corrects narrow vertical strips of the
digitized image by estimating the average bias of one channel with
respect to the other in that strip. This takes care of the variation
of the bias level along the scan direction and successfully debands the
digitized images. Additional logic is incorporated to account for the
saturated regions in the image (which were not banded before the
debanding process). Figures 26a and b are the debanded versions of
the images in Figures 25a and b. We see that the debanding algorithm
successfully corrects the differences between the two channels without
breaking up saturated targets. Figure 27 is a sample sheet containing
some of the 160 subframes that have been digitized and debanded for the
PATS simulation effort.

Figure 25. Examples of Banding Due to Differences in the Two Channel
Bias Variations in the Krebs FLIR Data Base

Figure 26. Result of Applying the Debanding Algorithm to the Frames in
Figure 25

Figure 27. Examples of FLIR Image Frames Digitized from the Honeywell
FLIR Tape for the PATS Training Data Base
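
The strip-wise bias correction described above might be sketched as follows in Python. The report does not say how the two channels map onto image rows, so this sketch assumes that even and odd rows come from different channels; that mapping, the function name, and the strip width are all assumptions. The additional logic for saturated regions is not reproduced here.

```python
def deband(image, strip_width=8):
    """Remove a per-strip bias between even and odd rows (assumed channels)."""
    rows, cols = len(image), len(image[0])
    out = [list(row) for row in image]
    for start in range(0, cols, strip_width):
        stop = min(start + strip_width, cols)
        even = [image[r][c] for r in range(0, rows, 2)
                for c in range(start, stop)]
        odd = [image[r][c] for r in range(1, rows, 2)
               for c in range(start, stop)]
        if not even or not odd:
            continue
        # Average bias of the odd-row channel relative to the even-row one.
        bias = sum(odd) / len(odd) - sum(even) / len(even)
        for r in range(1, rows, 2):
            for c in range(start, stop):
                out[r][c] = image[r][c] - bias
    return out
```

Because the bias is estimated separately in each narrow vertical strip, a droop that grows along the scan line is tracked strip by strip rather than being averaged over the whole line.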

The  NVL  imagery  does  not  suffer  from  the  banding.  Approximately  100 
frames  of  this  imagery  have  been  digitized.  Figure  28  shows  some 
examples  of  the  digitized  frames  from  this  tape. 

Tables 1 and 2 show the proportion of tanks, trucks, and APCs at various
aspects among the two sets of data digitized so far. We note that there
is a preponderance of tanks in both sets. More frames are being
digitized from the new NVL video tape to complete representation of
all aspect angles of tanks and APCs.

The  frames  we  have  digitized  so  far  are  "independent"  in  that  they  are 
often  more  than  a second  apart  and  taken  during  different  passes. 
Therefore,  we  are  also  digitizing  short  sequences  of  frames  (10  to  15 
frames)  to  use  in  testing  the  interframe  analysis  portion  of  the 
simulation. 

As  each  frame  is  digitized,  annotative  information  is  written  into  the 
magnetic  tape  header.  This  information  includes  for  each  target  in  the 
frame  the  target  type,  aspect,  size  (width  and  height  in  pixels),  and  its 
position in the image file. This header information and the hardcopy
film transparencies (reproduced in Figures 27 and 28) are indispensable
for feature analysis, evaluating the image segmentation, and supplying
the "ground truth" in training the classifiers.


Figure 28. Examples of FLIR Frames Digitized from the NVL Video Tape
for the PATS Training Data Base

TABLE 1. PROPORTION OF TARGET TYPES AND ASPECTS IN KREBS
DATA (FROM THE HONEYWELL FLIR) DIGITIZED TO DATE
(Average elevation angle = 30°)

                          Aspect Angle
Target
Type        0    45    90   135   180   215   270   315   Total

Tank       12     0     5     2    32     0     0     0      51
APC        15     0     2     1    16     0     2     6      42
Truck      21     0     2     0    23     0    16     0      62
Jeep        1     0     5     2     4     0     0     4      16


TABLE 2. PROPORTION OF TARGET TYPES AND ASPECTS IN THE NVL
FLIR IMAGERY DIGITIZED TO DATE (Average elevation ≈ 0°
--Ground mounted FLIR)

                          Aspect Angle
Target
Type        0    45    90   135   180   215   270   315   Total

Tank       20     3    13     3    13    12    17     2      83
APC         3     4     2     0     0     3     0     ?      15
Truck       0     0     0     0     0     0     0     0       0
Jeep        2     0     1     0     2     0     0     0       5

The digitizer at the Honeywell facility digitizes along columns of the image.
As a consequence, each image row on the magnetic tape corresponds to a
column of the displayed image (perpendicular to the scan direction).
Therefore, we transpose the digitized image files so that the rows in the
image file correspond to the scan lines in the original image. This enables
a more faithful simulation of the PATS hardware concepts.
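
As a minimal illustration of this transposition, the Python sketch below swaps rows and columns of an image held as a list of rows. The in-memory representation and the function name are assumptions; the actual files reside on magnetic tape.

```python
def transpose(image):
    """Swap rows and columns of an image stored as a list of rows."""
    return [list(column) for column in zip(*image)]
```

After the call, each row of the result holds what was a column of the digitized file, so rows once again run along the scan direction.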

IMAGE  SEGMENTATION 

Figure 29 is an overview of the target screening functions involved in
PATS. The function of image segmentation is to extract simple regions
from  the  image  that  are  characteristic  of  targets.  In  this  subsection  we 
describe  the  results  of  our  simulation  of  the  segmentation  process  and 
then  discuss  the  sensitivity  of  the  segmentation  to  the  parameter  changes 
in  the  algorithm.  The  goal  of  the  simulation  is  to  render  the  first  level 
segmentation  (candidate  target  extraction)  robust,  so  that  reliable 
recognition  features  can  be  extracted  from  the  segmentation. 

Autothreshold 

As we saw in Figure 29, we need to extract the "object intervals" along
each scan line; these are then concatenated in the bins to complete the
representation of each candidate object. This enables purely sequential
real time operation on the serial video. These object intervals are
generated by coincidence of "brights" and "edges" along a scan line.

Edges are thresholded binary outputs from a two-dimensional edge
operator and the brights are also thresholded binary outputs obtained by
thresholding the input image. This threshold has to be scene adaptive and
local (not global), hence the name autothreshold. With the present
autothreshold, these adaptive and local properties for generating the binary
brights are realized by the use of the background estimator (see Figure 30).
The background estimator is an adaptive two-dimensional recursive filter
that yields the local background average at every point in the image. This
background is subtracted from the input video and the result is thresholded
to get the brights. The advantage of the autothreshold for brights is
that only departure from the local background is considered. Thus,
slowly varying background will not be thresholded into false target
contours, as when a constant global threshold is employed. Figure 30
shows two bright definitions being experimented with. One is a low
contrast "absolute bright," and the other is a higher contrast "hot"
bright. The former thresholds both positive and negative departures
from the background, while the latter thresholds only the hotter areas
from the background. We will defer discussion of these thresholds
to a later subsection.
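
The two bright definitions can be sketched for one scan line as follows. This Python example assumes the background estimate is already available for the line; the function name and both threshold parameters are hypothetical, and actual threshold values are not given in this subsection.

```python
def brights(line, background, abs_threshold, hot_threshold):
    """Return (absolute_brights, hot_brights) as 0/1 lists for a scan line."""
    absolute, hot = [], []
    for sample, bg in zip(line, background):
        diff = sample - bg
        # Absolute bright: departure from the background in either direction.
        absolute.append(1 if abs(diff) > abs_threshold else 0)
        # Hot bright: only areas sufficiently hotter than the background.
        hot.append(1 if diff > hot_threshold else 0)
    return absolute, hot
```

A pixel well below the background is flagged as an absolute bright but not as a hot bright, which is the distinction drawn between the two definitions.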

Background  Estimator 

The  backgroimd  estimator  shown  in  Figure  31  is  identical  in  structure 
to  the  two-dimensional  recursive  low  pass  filter  used  in  the  local  area 
gain /brightness  control  (Section  II)  with  the  exception  of  the  switch  SWl 
that  comes  into  play  when  the  background  estimator  hits  a target  area. 

We do not want the hot/cold target areas to affect the background
estimate. Therefore, the presence of an object interval (the binary FI)
at the corresponding point on the previous scan line opens the switch SW1,
and the horizontal low pass filter holds its output at a constant value until
the object ceases. The switch then closes and normal background updating
resumes.

Figure 31. Background Estimator

Figure 32 illustrates this for an artificial example. In the absence of
the switch, for a sufficiently large target, the output of the
horizontal filter rises and falls off gradually as shown. This tends to
result in loss of the target "brights" at the right and bottom of the
target, and creation of artificial "cold" stretches after the target.

Figure 32. The FI Controlled Switch Helping in Background Estimation

We show the utility of this switch in Figure 33a, b and c. Figure 33a is the
original FLIR frame containing a hot truck. Figure 33b is the background
estimate without the switch, and Figure 33c is the estimate with the switch.
We note that the background is faithfully reproduced in Figure 33c. In
Figure 33b we see that the target causes the estimated background to rise
with it, which is undesirable in a background estimate.

a. FLIR frame
b. Background estimate without switch
c. Background estimate with switch
Figure 33. The Effect of the Switch in the Background Estimation
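
The hold behavior of the switch can be sketched in a few lines of Python. This is only an illustration of the horizontal part of the estimator under stated assumptions: the first-order update and the gain `k` (standing in for the RC time constant) are hypothetical choices, not the circuit of Figure 31, and the vertical recursive stage is omitted.

```python
def horizontal_background(line, object_mask, k=0.1):
    """First-order recursive low pass along one scan line with a hold switch.

    line        -- pixel intensities of the current scan line
    object_mask -- binary object-interval flags (FI) taken from the
                   corresponding points on the previous scan line
    k           -- hypothetical smoothing gain standing in for RC
    """
    estimate = float(line[0])
    out = []
    for pixel, in_object in zip(line, object_mask):
        if not in_object:
            # Switch SW1 closed: normal background updating.
            estimate += k * (pixel - estimate)
        # Switch SW1 open: the filter output is held at its last value.
        out.append(estimate)
    return out
```

With a flat background of 10 and a hot stretch of 100 flagged by the mask, the held estimate stays at 10, whereas an all-zero mask lets the estimate rise with the target and decay slowly afterwards.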
Sensitivity to Parameter β


The background filter has the parameter β (0 < β < 1) and the time constant
RC, which need to be tuned for best performance.* Changing β varies
the apparent size of the local area over which the background averaging is
done. Note that β is the feedback coefficient in the vertical recursive
filter. The closer β is to unity, the larger the local area will be. We
would like β to be large so that small local variations do not affect the
background estimate. But making β too close to unity makes the filter
susceptible to CCD noise (generated in the line delay). We experimented
with β = 0.8, 0.9 and 0.95.

Figure 34 is the original FLIR frame, and we reproduce the brights
obtained using β = 0.8, 0.9 and 0.95 in Figure 34b, c, and d respectively.
Since β = 0.9 can be realized without undue CCD noise**, we chose β = 0.9.

Threshold  Selection 

The  brights  are  obtained  by  thresholding  the  difference  between  the 
image  and  the  local  background  estimate.  We  illustrate  this  in  Figure  35 
with  the  intensity  profile  of  an  actual  thermal  image.  The  usefulness 
of  the  background  estimate  is  obvious  from  this  example.  Without  it, 
a global  threshold  would  have  difficulty  extracting  the  target  without 
getting  large  chunks  of  background  at  the  left  of  the  profile. 

*For reasons of symmetry, the horizontal filter time constant RC is
chosen so that it matches the vertical filter impulse response.

**The vertical recursive filters in the local area gain/brightness
breadboard use a β of 0.9.

Figure 34a. Brights Obtained by the Autothreshold with β = 0.80 (No Switch)
(β = 0.8, C = 3.0, K = 20)

Figure 34b. Brights Obtained by the Autothreshold with β = 0.90 (No Switch)
(β = 0.9, C = 3.0, K = 40)

Figure 34c. Brights Obtained by the Autothreshold with β = 0.95 (No Switch)
(C = 3.0, K = 50)

Figure 35. An Actual Intensity Profile Across a FLIR Image Cross
Section. (The background estimate and the threshold
contours are shown in broken lines.)

Now, let the image intensity at point (x, y) be I(x, y), and the
corresponding background estimate at (x, y) be b(x, y). Then B(x, y) = 1
(a bright) if

\[ I(x,y) - b(x,y) > T_H \quad \text{(Hot threshold)} \]

or

\[ |I(x,y) - b(x,y)| > T_L \quad \text{(Low contrast threshold)} \]

Both threshold definitions are being tested independently.

In  the  first  definition,  we  get  only  areas  that  are  considerably  hotter  than 
the  background.  In  the  second  we  get  low  contrast  differences  (both 
hotter  and  colder  than  the  background). 

The thresholds T (T_H and T_L) should be scene adaptive. Obviously, they
should be a function of the scene variance around the background estimate,
i.e.,

\[ T = C \cdot \frac{1}{N} \sum_{x,y} |I(x,y) - b(x,y)| \]

where C is a constant of proportionality, N is the number of points in
the summation, and the mean absolute difference
approximates the standard deviation.* This standard deviation can be
computed  either  globally  over  the  whole  frame,  or  locally  (for  example) 
over  the  previous  scan  line.  We  experimented  with  both  definitions  and 
found  that  the  global  estimate  gave  more  robust  results.  Figure  36a  and  b 
are  brights  obtained  using  the  local  and  global  definitions  respectively  for 
C = 2.  The  corresponding  results  for  C = 3 are  shown  in  Figure  36c  and  d. 
Note  that  in  order  to  implement  the  global  definition  of  the  threshold,  we 
have  to  have  the  background  variance  over  the  whole  frame.  This  can  be 
achieved  by  using  the  variance  estimate  from  the  previous  frame  as  the 
threshold  on  the  current  frame  while  the  new  variance  is  being  computed 
over  the  current  frame. 
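
A minimal Python sketch of the two bright definitions with a global, scene-adaptive threshold follows. It assumes images are plain lists of rows; the function names and the `hot_only` flag are illustrative and do not come from the simulation software.

```python
def mean_abs_diff(image, background):
    """Global mean absolute difference between image and background."""
    total, count = 0.0, 0
    for row_i, row_b in zip(image, background):
        for i, b in zip(row_i, row_b):
            total += abs(i - b)
            count += 1
    return total / count


def brights(image, background, threshold, hot_only=True):
    """Binary bright map: 1 where the departure from background exceeds T.

    hot_only=True  -- "hot" definition, I - b > T
    hot_only=False -- low contrast definition, |I - b| > T
    """
    out = []
    for row_i, row_b in zip(image, background):
        row = []
        for i, b in zip(row_i, row_b):
            diff = (i - b) if hot_only else abs(i - b)
            row.append(1 if diff > threshold else 0)
        out.append(row)
    return out
```

In use, the threshold for the current frame would be `C * mean_abs_diff(prev_image, prev_background)`, so that the previous frame's statistic is applied while the new one is being accumulated.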


*It can be shown that, for the Gaussian distribution, the mean absolute
difference = \sqrt{2/\pi}\,\sigma, where σ is the standard deviation.

a. Bright image, local threshold, C = 2.0
b. Bright image, global threshold, C = 2.0
c. Bright image, local threshold, C = 3.0
d. Bright image, global threshold, C = 3.0

Figure 36. Comparison of Local and Global Bright Thresholds
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Edge  Thresholds 

The  two-dimensional  edge  operator  we  incorporated  in  our  simulation  is 
similar  to  the  Sobel  edge  operator  but  was  designed  to  operate  in  a noisier 
environment.  It  offers  some  built-in  image  noise  smoothing.  The  mask 
for  this  edge  filter  is  shown  in  Figure  37.  We  see  that  the  edge  has  two 
components, the horizontal and the vertical. The mask is wider than it
is tall. The reason is that longer vertical
window dimensions are harder to realize (because more CCD delay lines
are needed) than horizontal ones (which use simple CCD transversal
filters or tapped CCD delay lines).


The horizontal component of the edge is simply the absolute difference of the
average of the nine pixels to the left and right of the current point. The
vertical component is similarly defined in Figure 37. The total edge
is then the sum of the two components.


This  edge  is  thresholded  to  get  the  binary  output  in  a manner  similar  to 
the  bright  threshold.  Here  the  threshold  is  a function  of  the  mean  edge 
values 1) over the previous line, or 2) over the previous frame
(T = C × mean edge). Again, a comparison of the two showed that the
global definition was more robust. We see this in Figure 38a, b, c, and d
with the local and global definitions for C = 2 and C = 3. When the edge
threshold  is  a function  of  the  edges  over  the  previous  scan  line,  the 
presence  of  a very  "edgy"  scan  line  can  cause  the  threshold  to  be  raised, 
resulting  in  missed  binary  edges  on  the  next  scan  line. 

Object  Interval  Extraction 

The  object  intervals  are  extracted  on  a scan  line  basis  using  the  thresholded 
binary  brights  and  edges.  Figure  39  conceptually  shows  the  interval 
generation  on  the  current  ATSS  hardware.  The  interval  is  started  by  the 
coincidence  of  an  edge  and  the  bright.  It  is  stopped  when  both  the  bright 
and  edge  cease.  This  lack  of  symmetry  in  starting  and  stopping  the 
object intervals exists because trailing edges are often feeble in FLIR
imagery.

c. Local C = 3.0
d. Global C = 3.0

Figure 38. Comparison of Global and Local Edge Thresholds

Figure 39. Object Interval Generation Concept

The current turn on criterion on the present ATSS hardware is as follows:

Turn on interval at point n if

\[ \sum_{k=1}^{M} \mathrm{edge}(n+k) \ \mathrm{AND} \ \mathrm{bright}(n+k) > N \]

(i.e., turn on if N out of M coincidences of edges and brights exist at n)
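
The coincidence test can be written out as a short Python predicate. This is a sketch only: the function and argument names are hypothetical, and the comparison is taken as "at least N", following the parenthetical wording, since the printed inequality could be read either way.

```python
def turn_on(edge, bright, n, m_window, n_required):
    """Return True if an object interval should be started at point n.

    edge, bright -- binary sequences for one scan line
    m_window     -- M, number of points examined after n
    n_required   -- N, coincidences needed (assumed to mean "at least N")
    """
    hits = 0
    for k in range(1, m_window + 1):
        idx = n + k
        # Count points where an edge and a bright coincide.
        if idx < len(edge) and edge[idx] and bright[idx]:
            hits += 1
    return hits >= n_required
```

For example, with edges at points 1 to 3 and brights at points 2 to 5, the window of four points after n = 0 contains two coincidences, so the interval turns on for N = 2 but not for N = 3.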

This  turn  on  often  fails  because  either 

1.  Edges  and  brights  do  not  exactly  coincide,  that  is,  edge  precedes 
bright  (as  when  a bright  threshold  is  too  high). 

2. Even when the center of edge matches the bright (i.e., zero
phase) only one-half of the edge points are being used to start
the interval (see Figure 39).

3.  The  edge  is  sometimes  missing  or  feeble  (less  than  N wide)  on  a 
scan  line  although  edges  may  exist  on  adjoining  scan  lines. 

Therefore,  a new,  more  forgiving  turn  on  criterion  was  programmed  into 
the  simulation  for  evaluation. 

The new criterion is illustrated in Figure 40. This was designed to look
for the existence of edges on either side of the current scan line and for
the presence of edges oriented vertically. This criterion has also been
simulated and has proved very robust in detecting the object intervals.

Figure 40. Alternate Turn on Criterion for Object Interval Extraction


We  can  see  this  in  Figure  41.  Figure  41a  and  Figure  41b  are  the  brights 
and  edges  produced  on  a test  image.  Figures  41c  and  d are  the  intervals 
obtained  by  the  two  turn  on  criteria  we  have  discussed  so  far.  We  note 
that  the  new  criterion  is  much  more  reliable  in  segmenting  the  image. 

Bin  Generation  and  Elementary  Feature  Extraction 

The bin selection program accumulates the object intervals and records
the outline of the candidate object from the object intervals. The
flowchart of this logic is reproduced in Figure 42. Very simply, each bin
accumulates object intervals on successive scan lines that overlap
the previous interval assigned to the bin (the midpoint of one interval
must lie between the end points of the other). Missed
intervals are filled in, up to three lines. If there are no new intervals
for more than a set number of scan lines, the bin is considered closed and
further processing is done on the bin (such as computing the various
elementary features).
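
The bin assignment logic can be illustrated with the following Python sketch. It follows the midpoint test and the gap-based closing rule described for Figure 42, but the data layout (tuples of scan line, left, right), the name `max_gap`, and its default value are assumptions made for the example; filling of missed intervals and feature accumulation are left out.

```python
def overlaps(a, b):
    """Midpoint test between two (left, right) intervals."""
    mid_a = (a[0] + a[1]) / 2
    mid_b = (b[0] + b[1]) / 2
    return b[0] <= mid_a <= b[1] or a[0] <= mid_b <= a[1]


def build_bins(intervals, max_gap=3):
    """Group (scan_line, left, right) intervals, sorted by scan line, into bins."""
    open_bins, closed = [], []
    for line, left, right in intervals:
        # Close any bin whose last interval lies too many lines back.
        still_open = []
        for b in open_bins:
            if line - b[-1][0] > max_gap:
                closed.append(b)
            else:
                still_open.append(b)
        open_bins = still_open
        # Attach the interval to the first bin it matches, else open a new bin.
        for b in open_bins:
            last = b[-1]
            if last[0] < line and overlaps((left, right), (last[1], last[2])):
                b.append((line, left, right))
                break
        else:
            open_bins.append([(line, left, right)])
    return closed + open_bins
```

Two side-by-side objects on the same scan lines end up in separate bins, and an interval arriving after a long gap starts a new bin instead of extending a stale one.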

The inputs to the bin program are the following quantities:

1. Object interval (scan line number, beginning counter value,
width of interval, edge count, bright count);

2. Average background near interval (i.e., the background filter
sampled at beginning of interval);

3. Absolute intensity average over interval (integrated input
over the object interval); and

4. Peak intensity over interval.
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c.  Intervals  with  present  ATSS  criterion  d.  Intervals  with  alternate  criterion 
Figure 41. Two Alternate Interval Extraction Criteria

WHEN THE END OF A SCAN LINE IS REACHED, THE SCAN LINE NUMBER OF
THE LAST INTERVAL IN EACH BIN IS COMPARED WITH THE NUMBER OF THE
SCAN LINE JUST ENDED. IF THE DIFFERENCE EXCEEDS A SET THRESHOLD
THE BIN IS CLOSED.

MIDPOINT OF INTERVAL LIES BETWEEN END POINTS OF LAST
INTERVAL IN BIN OR MIDPOINT OF LAST INTERVAL IN BIN LIES
BETWEEN END POINTS OF CURRENT INTERVAL.

UPDATE BIN: ADD TO LENGTH AND AREA,
INCREASE # OF EDGES AND BRIGHTS,
UPDATE PEAK AND MINIMUM INTENSITY,
UPDATE SUM FOR AVERAGE INTENSITIES,
ETC.

Figure 42. Bin Generation Flowchart
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Items  2,  3,  and  4 are  new  to  the  present  ATSS  hardware.  Their  purpose 
is  to  introduce  intensity  (both  absolute  and  contrast)  information  into 
the decision making process. These values are further integrated in the
bin outputs. The outputs of the bins are as follows:

1. The outline of the bins (left and right coordinates of all intervals)
after median filtering (see below);

2. Total length of target (total number of intervals) including
missed intervals;

3. Active length (n), not including missed intervals;

4. Average width, (1/n) Σ_i W_i;

5. Area (sum of active interval widths, Σ_i W_i);

6. Edge straightness (S), defined in terms of b and e, the beginning
and end coordinates of each interval;

7. Edge discontinuity (E), a sum of |W_i - W_{i-1}| over those
intervals for which |W_i - W_{i-1}| > 3;
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8.  Edge  count--sum  of  all  edge  counts/n; 

9.  Bright  count--sum  of  all  bright  counts/Area; 

10. Average background--average of all interval background values;

11. Average target intensity, Σ_i T_i · W_i / Active Area,
where T_i is the average interval intensity and W_i is the interval
width;

12. Peak target intensity max(P_i), where P_i is the peak intensity over
interval i;

13. Average width/length ratio;

14. Mean square error of left and right edge fits to straight lines;

15. Slopes of the linear least squares fit to the left and right edges
respectively.

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Smoothing  of  Object  Boundaries 

The  extracted  object  outline  is  usually  far  from  perfect.  There  are 
missed  intervals  and  also  intervals  that  are  too  long  to  belong  to  the 
outline.  The  end  points  of  the  bin  are  therefore  subjected  to  a form 
of  boundary  smoothing.  Rather  than  linear  smoothing  that  rounds  off 
sharp  corners,  we  are  experimenting  with  median  filtering  of  these 
boundaries.  Instead  of  averaging  the  end  points  over  a window  three  pixels 
wide,  each  end  point  (left  or  right)  is  replaced  by  the  median 
(middle  value)  of  three  end  point  coordinates,  including  its  two 
neighbors  and  itself.  This  gets  rid  of  excessively  long  lines  and  very 
short  lines  flanked  by  "normal"  end  points.  Figures  43a  and  b are 
the  intervals  generated  and  the  actual  bin  outlines  obtained  for  a FLIR 
image processed through the simulation. Note the end point smoothing
achieved by this filtering and by the bin generation logic.
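
A three-point median filter of this kind takes only a few lines of Python. The sketch below assumes the end points of one side of the bin (all left or all right coordinates) are held in a list, and it leaves the first and last entries unchanged, which is an assumption about boundary handling.

```python
def median3(values):
    """Replace each interior value by the median of itself and its two neighbors."""
    out = list(values)
    for i in range(1, len(values) - 1):
        # Middle value of the three-sample window centered on i.
        out[i] = sorted(values[i - 1:i + 2])[1]
    return out
```

For example, `median3([10, 10, 30, 11, 12])` gives `[10, 10, 11, 12, 12]`: the isolated excursion to 30 is removed, while a genuine step in the boundary would be preserved rather than rounded off.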

RECOGNITION  FEATURES 


Two  classes  of  recognition  features  are  being  evaluated.  They  are  the 
moment features and the Fourier boundary descriptors (FBD).

Moment  Features 

Two-dimensional moment features have been used in character
recognition⁴ and aircraft identification⁵. These are intensity moments,
silhouette (area) moments, and boundary (binary) moments. We are
evaluating all three classes as candidates.

⁴M.K. Hu, "Visual Pattern Recognition by Moment Invariants," IRE
Transactions on Information Theory, pp. 179-187, February 1962.

⁵S.A. Dudani, et al., "Aircraft Identification by Moment Invariants,"
IEEE Transactions on Computers, pp. 39-46, January 1977.

Figure 43a. Object Intervals Extracted

Figure 43b. Boundary of Objects Traced by the Bin
(After Smoothing the Edges)


Let I(m, n) be the object intensity. The (p, q)th moment is given by

\[ m_{pq} = \frac{1}{N} \sum_{m} \sum_{n} I(m,n)\, m^{p} n^{q} \]

where

N = number of points in summation

If the scale changes by α, for example, i.e., m' = αm, n' = αn, then we
can show that

\[ m'_{pq} = \alpha^{p+q}\, m_{pq} \tag{1} \]

We are interested in central moments μ_pq, which are the above moments
m_pq evaluated around (x̄, ȳ) as the origin, where

\[ \bar{x} = m_{10}/m_{00}, \qquad \bar{y} = m_{01}/m_{00} \]

It is clear that μ_pq can be expressed in terms of m_pq and (x̄, ȳ).

Invariance to Size--From Equation (1), we can see that

\[ \mu'_{pq} = \alpha^{p+q}\, \mu_{pq} \]

Consider

\[ \frac{\mu'_{pq}}{(\mu'_{02} + \mu'_{20})^{(p+q)/2}}
 = \frac{\alpha^{p+q}\, \mu_{pq}}{\alpha^{p+q}\, (\mu_{02} + \mu_{20})^{(p+q)/2}}
 = \frac{\mu_{pq}}{(\mu_{02} + \mu_{20})^{(p+q)/2}} \]

which is invariant to scale α.

Now, r = (μ_02 + μ_20)^{1/2} is called the radius of gyration. Therefore, we
normalize all μ_pq by r^{p+q}, which makes them invariant to size.

There  are  three  different  kinds  of  moments  we  are  exploring: 

1.  Intensity  moments  (as  above  with  I(x,  y)  representing  the 
target  intensities); 

2.  Silhouette  moments  (I(x,y)  = 1 over  all  points  within  target 
boundary);  and 

3. Boundary moments (I(x, y) = 1 on boundary; 0 elsewhere).
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For  each  class,  we  compute  all  moments  up  to  and  including  third  order 
moments, i.e.,

\[ \mu_{00},\ \mu_{11},\ \mu_{20},\ \mu_{02},\ \mu_{21},\ \mu_{03},\ \mu_{30},\ \mu_{12} \]

Note that μ_10 = μ_01 = 0 because these are central moments.

1.  Intensity  Moments 

Only  the  target  intensities  I(m,  n),  which  are  bounded  by  the 
outline  extracted  by  the  autothreshold  (bin  intervals),  are 
used. 

As mentioned before, all μ_pq are normalized by r^{p+q} for size
invariance. They are also divided by
m_00 = (1/N) Σ_m Σ_n I(m, n) to
normalize them by the average target intensity.

2. Silhouette Moments

We do not need the original image intensities for the silhouette
moments.

\[ m_{pq} = \frac{1}{N} \sum_{m} \sum_{n} m^{p} n^{q} \]

because I(m, n) = 1 inside the object and 0 elsewhere,
where (m, n) are all points contained inside the boundary
extracted.
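
The size normalization can be checked numerically with a small Python sketch. It assumes an object is given as a list of (m, n, intensity) triples, with intensity 1 for silhouette moments; the function names are illustrative. The additional division by m_00 used for intensity moments is not included.

```python
def central_moment(points, p, q):
    """Central moment mu_pq of a list of (m, n, intensity) points."""
    n_pts = len(points)
    m00 = sum(w for _, _, w in points) / n_pts
    m10 = sum(w * m for m, _, w in points) / n_pts
    m01 = sum(w * n for _, n, w in points) / n_pts
    xbar, ybar = m10 / m00, m01 / m00
    return sum(w * (m - xbar) ** p * (n - ybar) ** q
               for m, n, w in points) / n_pts


def normalized_moment(points, p, q):
    """mu_pq divided by r^(p+q), with r^2 = mu_02 + mu_20 (radius of gyration)."""
    r2 = central_moment(points, 0, 2) + central_moment(points, 2, 0)
    return central_moment(points, p, q) / r2 ** ((p + q) / 2)
```

Scaling every coordinate of a point set by a factor α multiplies μ_pq by α^(p+q) but leaves the normalized value unchanged, which is the property derived above.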
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Fourier  Boundary  Descriptors 


Since  human  target  recognition  is  often  done  using  the  target  outline 
(boundary) in FLIR imagery, we are evaluating boundary shape
descriptors as potential recognition features. The boundary shape
descriptors are derived from the object boundary extracted by the bin
generation program in the form of the beginning and end coordinates
of successive intervals in the bin. The approach we are taking is
similar to Zahn.⁶ The object boundary is first encoded in a periodic
angle  versus  arc  length  waveform.  The  one-dimensional  discrete 
Fourier  transform  (DFT)  of  this  waveform  is  then  taken.  The  first 
few  amplitude  and  phase  coefficients  are  used  as  shape  discriminators. 

If  these  Fourier  boundary  descriptors  (FBD)  turn  out  to  be  useful 
shape  discriminators  in  the  PATS  simulation,  their  implementation 
via  the  new  CCD  Fast  Fourier  Transform  (FFT)  module*  is  deemed 
simple.  Following  is  a brief  description  of  this  process  in  the  current 
simulation. 

Given the object intervals as in Figure 44a, we have to encode the
outline in a simple θ(ℓ) form (θ is the angle with respect to the X axis;
ℓ is the arc length from an origin). In a discrete situation, as here,
we have to assume that the arc lengths are not all equal. Let the object
(a simple curve) be described by the interval coordinates x_1, x_2, ..., x_K
(see Figure 44a). (Actually, they are given as the left and right object
intervals, from which the x's can be derived.)

⁶C.T. Zahn and R.Z. Roskies, "Fourier Descriptors for Plane Closed
Curves," IEEE Transactions on Computers, pp. 269-281, March 1972.

*Reticon 5601 CCD FFT Module.


Let Δℓ_k = arc length between x_k and x_{k+1},

θ_k = angle between the vector (x_k, x_{k+1}) and the horizontal
(measured as shown in Figure 44),

and ℓ_k = Σ_{i=1}^{k} Δℓ_i.

Then we need to derive the list {(θ_1, ℓ_1), (θ_2, ℓ_2), ..., (θ_K, ℓ_K)}
from x_1, ..., x_K.

Note that the spacing between successive scan lines (intervals) is always
1 (unity); i.e., |Δy| = 1. Therefore, (x_k, x_{k+1}) uniquely determine
θ_k and Δℓ_k as follows:


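\[ \Delta\ell_k = \begin{cases}
\sqrt{1 + (x_k - x_{k+1})^2}, & x_k, x_{k+1} \text{ on the same boundary} \\
|x_k - x_{k+1}|, & (x_k \in \text{left},\ x_{k+1} \in \text{right}) \text{ or } (x_k \in \text{right},\ x_{k+1} \in \text{left})
\end{cases} \]

\[ \theta_k = \begin{cases}
\arctan\left[\dfrac{1}{x_k - x_{k+1}}\right], & x_k, x_{k+1} \in \text{left boundary} \\
180^\circ, & x_k \in \text{left},\ x_{k+1} \in \text{right} \\
\arctan\left[\dfrac{1}{x_k - x_{k+1}}\right] + 180^\circ, & x_k, x_{k+1} \in \text{right boundary} \\
0, & x_k \in \text{right},\ x_{k+1} \in \text{left}
\end{cases} \]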
The next step is to compute the Fourier transform of the θ(ℓ) curve
defined by the sequence (θ_1, ℓ_1), ..., (θ_K, ℓ_K). But this does not
represent equally spaced samples of θ as a function of ℓ (see Figure 45).
We also need to normalize the ℓ axis (arc length) to a constant perimeter.
Say that total perimeter = 512 (a 512-point DFT can then be used). As
we see in Figure 45, these θ'_n = θ(nΔℓ) are determined from the list
(θ_i, ℓ_i) by replication (Δℓ = ℓ_K/512).

Once θ'_i, i = 1, ..., 512 are determined, we need to arrive at φ_i,
i = 1, ..., 512.

Figure 45. Derivation of the Equispaced Values of θ(ℓ) as a Function of ℓ,
from the Set {(θ_1, ℓ_1), ..., (θ_K, ℓ_K)}

Note that if we had a circle to begin with, then all the φ_i = 0. This is
done to assure that the dynamic range of the FFT is more fully utilized.
Thus, the φ_i measure the departure of the object shape from the circle.

Let Φ(n), n = 1, ..., 512 be the complex FFT of the φ_i's obtained above.

The amplitude coefficients are given by

\[ A(n) = \sqrt{[\mathrm{Re}\,\Phi(n)]^2 + [\mathrm{Im}\,\Phi(n)]^2} \]

and phase

\[ \alpha(n) = \arctan\frac{\mathrm{Im}\,\Phi(n)}{\mathrm{Re}\,\Phi(n)} \]

Properties of the Fourier Descriptors--These properties are discussed
at length by Zahn. For our purposes, we note that the amplitudes A_i,
except the DC term (i = 1), are

• Size invariant (because of the built-in scaling); and

• Rotation invariant (independent of the starting point on
the curve).

The phase angles α_i of the Fourier coefficients are not rotation invariant.
They are a function of the starting point on the curve. Therefore, we
include the phases and the DC term as features to be evaluated.
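
The invariance of the amplitudes can be demonstrated with a compact Python sketch. It is not the simulation code: it takes a closed polygon as a list of (x, y) vertices traversed counterclockwise instead of left and right interval end points, uses a plain DFT, and normalizes by subtracting a linear ramp of 2π per perimeter so that a circle would map to zero. That ramp is an assumption in the spirit of Zahn and Roskies, chosen to match the stated property of the φ values; all function names are hypothetical.

```python
import cmath
import math


def angle_arc_list(vertices):
    """Return [(theta_k, cumulative arc length)] for a closed polygon."""
    out, total, prev = [], 0.0, None
    count = len(vertices)
    for k in range(count):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % count]
        theta = math.atan2(y1 - y0, x1 - x0)
        if prev is not None:
            # Unwrap so successive angles never jump by more than 180 degrees.
            while theta - prev > math.pi:
                theta -= 2 * math.pi
            while theta - prev <= -math.pi:
                theta += 2 * math.pi
        total += math.hypot(x1 - x0, y1 - y0)
        out.append((theta, total))
        prev = theta
    return out


def resample(pairs, n_samples):
    """Replicate theta at n_samples equally spaced arc lengths."""
    step = pairs[-1][1] / n_samples
    out, k = [], 0
    for i in range(n_samples):
        s = (i + 0.5) * step
        while pairs[k][1] < s:
            k += 1
        out.append(pairs[k][0])
    return out


def fourier_amplitudes(vertices, n_samples=64, n_coeffs=8):
    """Amplitudes of the first n_coeffs non-DC harmonics of phi."""
    theta = resample(angle_arc_list(vertices), n_samples)
    # Remove the starting angle and the 2*pi ramp of a circle (assumed form).
    phi = [t - theta[0] - 2 * math.pi * i / n_samples
           for i, t in enumerate(theta)]
    amps = []
    for n in range(1, n_coeffs + 1):
        coeff = sum(p * cmath.exp(-2j * math.pi * n * i / n_samples)
                    for i, p in enumerate(phi))
        amps.append(abs(coeff))
    return amps
```

Running this on a 3-4-5 right triangle with two different starting vertices, or at twice the size, gives the same amplitudes to within rounding, while the phases (not computed here) would differ with the starting point.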


This software is operational. Several test images were run through to
verify its operation. Figures 46a, b, and c show the θ(ℓ), φ(ℓ) and the
Fourier amplitudes for a right triangle. The corresponding curves
for a rotated right triangle are reproduced in Figures 47a, b, and c.
This shows the invariance of the amplitude features to rotation.

Figure 46b. φ(ℓ) Representation for the Right Triangle

Figure 46c. (Fourier amplitude plot)

Figure 47c. Fourier Harmonic Amplitudes for the Rotated Right Triangle
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SYSTEM  SIMULATION  SOFTWARE 


All the components of the PATS system simulation up to and including
the feature extraction/analysis stage have been assembled. The outline
of this software is shown in Figure 48. The output of the bin and
feature extraction program includes a magnetic tape file consisting of all
the bin coordinates (boundary representation), the various elementary
features (size, average intensity, edge and bright counts, length/width,
etc.), and the recognition features (the moment and Fourier features).

These features are combined in the File Supervisor program with the
ground truth (class of target or clutter, aspect, size, etc.) to generate
a feature tape to be input to the analysis software. The analysis
software  includes  two-dimensional  scatter  (cluster)  plots  and  histograms 
of  single  features.  We  are  now  interfacing  the  classifier  software  to 
this  output  so  that  discriminant  analysis  can  be  used  to  test  the 
features. 

A sample  set  of  three  FLIR  frames  processed  through  the  entire  simulation 
is  reproduced  in  Figures  49  through  51.  Note  that  the  high  contrast  truck 
in  Figure  49  is  clearly  segmented,  as  is  the  lower  contrast  tank  in  Figure  50. 
The  bins  obtained  from  these  images  were  processed  to  get  the  elementary 
features  and  the  recognition  features.  These  contain  the  three  targets,  as 
well  as  four  clutter  objects  identified  on  the  three  frames. 

We reproduce some of these features measured on the bins in Table 3.
Because of space limitations, not all of the features listed in Table 3 are
represented. It is of interest that the "edge count" feature is consistently
higher for the targets (as is the "average contrast"). This, of course,
is not a statistically significant sample of features, but serves to
demonstrate the state of progress in the simulation software task.

Figure 48. PATS Screening Simulation Software Outline

Figure 49b. Results of Simulation Candidate Object Outline Extracted by the
Bin Generation Program from Intervals in Figure 49a

Figure 50b. Results of Simulation Outline Extracted by the Bin Generation
Program from the Intervals in Figure 50a

Figure 51a. Results of Simulation (Brights, Intervals)

Figure 51b. Results of Candidate Object Outline Extracted by the Bin
Generation Program from the Intervals in Figure 51a
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SECTION  IV 


INTERFRAME ANALYSIS


The major purpose of our interframe analysis is twofold. The first
purpose concerns noise reduction and the second is related to target
motion. In noise related analysis the goal is to reduce the effect of
noise on classifier decisions. This may be done by combining in some
suitable way the noise-sensitive measurements of a detected object
from several successive or near successive frames. Specifically, we
wish to combine the various feature values of, and classifier decisions
made on, each object in a frame. In motion related analysis the goal
may be subdivided as follows: 1) detect targets from their motion,
2) track the targets, and 3) aid the object extraction mechanism by
predicting the position and the local background of a target.

For  both  of  the  above  purposes  we  need  a frame  to  be  matched  with 
another  frame.  We  have  tried  this  interframe  registration  at  a 
"symbolic"  rather  than  pixel  level.  The  objects  are  first  extracted 
in  the  two  frames  independently.  Some  of  these  extracted  objects  are 
targets  that  may  or  may  not  be  moving;  the  remaining  are  parts  of  the 
background.  Using  these  extracted  objects  we  first  achieve  a coarse 
alignment of the frames. This is followed by a match of the "symbols",
i.e., the objects.

FRAME ALIGNMENT



When  the  sensor  is  in  motion  the  stationary  objects  in  the  frames  will  have 
a relative  displacement  with  respect  to  the  frame  coordinates.  To  find  a 
match  for  an  object  in  a frame,  a search  has  to  be  made  over  all  the 
objects  in  the  other  frame.  The  neighborhood  of  search  can  be  reduced 
if  the  two  frame  coordinate  systems  are  adjusted  with  respect  to  each 
other  to  correct  for  the  sensor  motion.  This  adjustment  will  not  be 
necessary  when  it  is  known  a priori  that  the  sensor  is  stationary.  Such 
frame  alignment,  however,  may  be  needed  for  more  general  input. 

The  frame  alignment  is  based  on  the  assumption  that  most  of  the  objects 
in  the  frame  are  stationary.  The  alignment  is  performed  by  using  the 
locational information of each object in a frame. In general, the
rectification  due  to  sensor  motion  may  require  translation,  rotation,  and 
scale  change  of  a frame.  We  assume  that  the  sensor  motion  between 
the  two  frames  to  be  matched  is  small  enough  so  that  translation  alone 
may  give  adequate  frame  alignment  for  our  purposes.  For  example,  if 
the  frames  are  successive  or  near  successive  the  sensor  motion  may  be 
assumed  to  be  translation  only. 

Three  alternative  methods  of  frame  alignment  were  tried.  The  first 
one,  called  the  translation  histogram  method,  conceptually  works  as 
follows. The difference in the coordinates of an object in one frame,
say F_1, and an object in the other frame, say F_2, is computed.
Keeping the object in F_1 fixed, this computation is repeated for every
object in F_2. This process is then repeated for all other objects in F_1.
Every computed coordinate difference corresponds to a frame translation
that will match an object pair in the two frames. A two-dimensional
histogram of all the computed coordinate differences is made. The mode
of the histogram corresponds to a frame translation that will match the
largest number of object pairs in the two frames. This mode is taken
as the estimate of the translation necessary for the frame alignment.
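
The voting scheme can be sketched in Python as follows. Object locations are assumed to be integer (row, column) pairs; the function name and the optional `max_shift` limit are illustrative, and no histogram smoothing is applied in this sketch.

```python
from collections import Counter


def histogram_translation(objs_a, objs_b, max_shift=None):
    """Return the most frequent (d_row, d_col) between objects of two frames.

    objs_a, objs_b -- lists of (row, col) object locations
    max_shift      -- optional limit on |d_row| and |d_col| (hypothetical)
    """
    votes = Counter()
    for ra, ca in objs_a:
        for rb, cb in objs_b:
            d = (rb - ra, cb - ca)
            if max_shift is None or (abs(d[0]) <= max_shift
                                     and abs(d[1]) <= max_shift):
                votes[d] += 1
    # The mode of the two-dimensional histogram is the estimated translation.
    return votes.most_common(1)[0][0] if votes else (0, 0)
```

With three stationary objects displaced by (2, -3) between frames and one spurious object in the second frame, the true displacement collects three votes and every other difference only one.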

The  second  method  is  called  the  mean  translation  method.  In  this  method 
the  mean  location  of  the  objects  in  a frame  is  computed.  This  is  the 
sample  mean  of  the  coordinates  of  all  the  objects  in  a frame.  This 
computation is done for both the frames, F_1 and F_2. The difference
between  the  two  mean  locations  gives  the  translation  for  the  frame 
alignment.  This  method  assumes  that  the  mean  locations  of  the  objects 
are  matched  after  the  frame  alignment.  The  third  method,  called  the 
moment  translation  method,  is  similar  to  the  mean  translation  method 
except  that  the  mean  location  is  replaced  by  a weighted  mean  location 
for each frame. Each object location is weighted by the area of the
object.  The  sample  mean  of  these  weighted  coordinates,  normalized 
by  the  total  object  area  in  the  frame,  gives  the  area  moments  of  the 
two  frames.  The  difference  between  the  weighted  moments  of  the  two 
frames  is  the  required  amount  of  translation. 
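
Both the mean and the moment translation estimates reduce to a difference of centroids, as the short Python sketch below illustrates. Objects are assumed to be (row, column, area) triples, and the function names are hypothetical.

```python
def mean_translation(objs_a, objs_b):
    """Difference between the unweighted mean object locations of two frames."""
    def mean(objs):
        n = len(objs)
        return (sum(o[0] for o in objs) / n, sum(o[1] for o in objs) / n)
    ma, mb = mean(objs_a), mean(objs_b)
    return (mb[0] - ma[0], mb[1] - ma[1])


def moment_translation(objs_a, objs_b):
    """Difference between the area-weighted mean object locations."""
    def weighted_mean(objs):
        total = sum(o[2] for o in objs)
        return (sum(o[0] * o[2] for o in objs) / total,
                sum(o[1] * o[2] for o in objs) / total)
    ma, mb = weighted_mean(objs_a), weighted_mean(objs_b)
    return (mb[0] - ma[0], mb[1] - ma[1])
```

When a small spurious object appears in only one frame, the unweighted mean is pulled noticeably away from the true shift, while the area-weighted estimate stays close to it because the spurious object carries little weight.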

The  advantage  of  the  mean  translation  method  over  the  translation 
histogram  method  is  reduction  in  computational  complexity.  However, 
a tradeoff  in  accuracy  might  occur  as  a result.  A single  spurious 
object,  or  noise,  can  affect  the  mean  location  of  a frame.  The  moment 
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translation  method  may  bring  back  some  of  the  accuracy  by  giving 
conspicuous  large  objects  more  weight.  If  the  "noise"  objects  are 
relatively  small  then  this  method  may  yield  better  accuracy  than  the 
mean  translation  method. 

A sequence  of  five  frames  from  the  Krebs  FLIR  data  base  was  processed 
by  the  Honeywell  Augmented  Target  Screener  Subsystem  (ATSS).  The 
input  frames  and  the  objects  extracted  by  the  ATSS  are  shown  in 
Figures  52  and  53.  There  does  not  seem  to  be  appreciable  target 
motion  in  this  sequence  of  frames.  The  FLIR  sensor,  however,  is 
nonstationary.  It  may  be  noted  that  in  one  of  the  frames  the  ATSS  failed 
to  segment  the  target  from  the  input  image.  Figure  54a  shows  a 
typical  translation  histogram.  We  have  assumed  that  the  frame-to-frame 
displacement is less than 1/8th of the frame dimensions in each of the
row and column directions. Consequently, all translations greater
than 1/16th or less than -1/16th of the frame dimensions have been
ignored  in  the  histogram.  In  order  to  achieve  strong  and  robust 
modes,  the  histograms  were  smoothed  by  a 3 x 3 block  filter.  The 
filtered  histogram  corresponding  to  Figure  54a  is  shown  in  Figure  54b. 
The  resulting  frame  translations  are  shown  in  Table  4.  This  table 
also  shows  the  corresponding  translations  by  the  moment  and  the 
mean methods. F_1 is the frame being aligned (i.e., translated)
with the frame F_0. The translation is shown as (row, column) in
the table.
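
The range limit and the 3 x 3 block filtering of the histogram can be sketched as follows. This is an illustrative Python sketch, not the implementation used for the results above: max_shift is a hypothetical parameter standing in for 1/16th of the frame dimension, the filter is truncated at the histogram border, and ties are resolved in scan order.

```python
def smoothed_histogram_mode(objs0, objs1, max_shift):
    """Mode of the translation histogram after 3 x 3 block filtering.

    Differences outside [-max_shift, max_shift] in rows or columns are ignored.
    """
    size = 2 * max_shift + 1
    hist = [[0] * size for _ in range(size)]
    for r0, c0 in objs0:
        for r1, c1 in objs1:
            dr, dc = r0 - r1, c0 - c1
            if abs(dr) <= max_shift and abs(dc) <= max_shift:
                hist[dr + max_shift][dc + max_shift] += 1

    best, best_sum = None, -1
    for i in range(size):
        for j in range(size):
            # Sum over the 3 x 3 neighborhood, truncated at the border.
            s = sum(hist[a][b]
                    for a in range(max(0, i - 1), min(size, i + 2))
                    for b in range(max(0, j - 1), min(size, j + 2)))
            if s > best_sum:
                best, best_sum = (i - max_shift, j - max_shift), s
    return best

# Hypothetical data: five objects displaced by about (3, -5), each with one
# pixel of jitter, so no single histogram bin collects more than one vote.
frame0 = [(20, 20), (50, 40), (80, 70), (110, 30), (140, 90)]
frame1 = [(17, 25), (48, 45), (76, 75), (107, 36), (137, 94)]
mode = smoothed_histogram_mode(frame0, frame1, max_shift=8)
print(mode)
```

Without the filter every occupied bin in this example holds a single vote and the mode is ambiguous; the block filter pools neighboring bins so that the cluster of votes produces one clear peak.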

Figure 52. The Five Input Frames for Interframe Analysis

Figure 53. The Objects Extracted by ATSS from the Input Frames
(T = target): a. Frame #1; b. Frame #2; c. Frame #3; d. Frame #4;
e. Frame #5

Figure 54a. Translation Histogram for Aligning Frames 1 and 2

Figure 54b. Filtered Translation Histogram Corresponding to
Figure 54a

TABLE 4. FRAME TRANSLATIONS BY THE THREE METHODS

Frame pair    Histogram    Moment      Mean
1 - 2         (-2, 8)      (24, 17)    (31, 28)
2 - 3         (-4, 4)      (-26, 0)    (-14, 4)
3 - 4         (-2, 2)      (5, 4)      (10, -15)
4 - 5         (10, 0)      (-9, -5)    (-30, -29)

The  differences  in  the  translations  due  to  the  three  methods  are  substantial. 
The  following  procedure  was  devised  to  compare  the  accuracies  of  the 
three  methods.  For  each  frame-pair  under  consideration,  a list  of  object 
pairs  that  visually  appeared  to  match  each  other  was  made.  This  was 
treated  as  the  "ground  truth".  After  the  frame  alignments  the  distances 
between  these  ground  truth  object  pairs  were  computed.  These  distances 
were  regarded  as  the  alignment  errors.  The  total  alignment  error  for 
each frame pair and each method is shown in Table 5. Also shown in
the table, in parentheses, is the percentage of objects in the ground
truth that were nearest neighbors to their matching pairs after alignment.
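
The evaluation procedure can be sketched in Python as follows. Several details are assumptions made for illustration: the distance is taken to be Euclidean, the translation is added to every frame-1 location, and a ground-truth pair counts as a nearest-neighbor match when the matching frame-1 object is the closest of all frame-1 objects to the frame-0 object. The function name and data are hypothetical.

```python
import math

def alignment_error(objs0, objs1, truth_pairs, translation):
    """Total distance between ground-truth pairs after translating frame 1,
    and the fraction of those pairs that are nearest neighbors.

    truth_pairs: list of (i, j); object i in frame 0 matches object j in frame 1.
    translation: (d_row, d_col) added to each frame-1 location (assumed).
    """
    dr, dc = translation
    moved = [(r + dr, c + dc) for r, c in objs1]
    total, hits = 0.0, 0
    for i, j in truth_pairs:
        r0, c0 = objs0[i]
        dists = [math.hypot(r0 - r, c0 - c) for r, c in moved]
        total += dists[j]
        if dists[j] == min(dists):
            hits += 1
    return total, hits / len(truth_pairs)

# Hypothetical data: two true pairs and one spurious frame-1 object.
objs0 = [(10, 10), (40, 25)]
objs1 = [(7, 15), (37, 30), (12, 12)]
truth = [(0, 0), (1, 1)]
aligned = alignment_error(objs0, objs1, truth, (3, -5))
unaligned = alignment_error(objs0, objs1, truth, (0, 0))
print(aligned, unaligned)
```

A good alignment gives both a small total distance and a high nearest-neighbor fraction; in the unaligned case the spurious object lies closer to the first frame-0 object than its true match does.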

From  the  table  it  appears  that  the  histogram  method  is  more  accurate 
than  the  other  two.  Computationally,  however,  the  histogram  method  is 
the  most  costly  of  the  three  methods. 

TABLE 5. ALIGNMENT ERRORS AND NEAREST NEIGHBOR MATCHES

F_0 - F_1     Histogram     Moment        Mean
1 - 2         54 (100%)     310 (79%)     451 (78%)
2 - 3         63 (100%)     119 (100%)    71 (100%)
3 - 4         50 (100%)     86 (100%)     118 (100%)
4 - 5         101 (100%)    145 (100%)    362 (86%)


SYMBOLIC  MATCHING 

A major task in symbolic matching is the selection of a suitable set of
attributes or features of the objects that should be used in matching.
Another major task is the matching procedure itself. Typically, the
features used in symbolic matching are size, shape, color, texture, and
location [7, 8]. The speed restriction in real-time applications may
allow only a few simple features to be extracted. Other considerations
in extracting the features are the computational cost and the effectiveness
of the features for the specific applications and image qualities in mind.

[7] K. Price and D. R. Reddy, "Symbolic Image Registration and Change
Detection," Proceedings: Image Understanding Workshop, April 1977,
pp. 28-31.

[8] B. Glish, W. Kober, and G. Swanlund, "Image Registration Experiments,"
Ibid., pp. 32-37.

In the application of noise suppression by interframe analysis, the
selection of appropriate matching features is especially difficult. This is
because the features that change from frame to frame due to noise, and which
therefore need to be averaged in some manner, may not be used as matching
features. The task of selecting a good set of features for our application
needs to be investigated in detail. Here we shall demonstrate the result of
using some simple features in object matching. The features are the size and
the location of the object. These two features are easily obtained in the ATSS.

Our present effort is confined to the use of only one feature at a time in
matching the objects. The degree of mismatch, which is termed the
cost of matching, may be of two kinds. The first, the static cost, arises
from mismatch in the features of the two objects under consideration.
The second, the dynamic cost, is due to mismatch
or inconsistency in the interobject structural relationships [9]. In our
application the static cost is the absolute difference in the feature values
of the two objects being matched. Since the objects, e.g., targets,
may be moving with respect to each other, there is no constraint on the
structural relationship in our application. The only interobject
constraint is that no two different objects in one frame may be matched
with the same object in a second frame. This will dictate the dynamic
cost in our search procedure. Specifically, in matching the ith object
in F_0 with the kth object in F_1 and matching the jth (j ≠ i) object in F_0
with the mth object in F_1, the dynamic cost is infinite if k = m and zero
otherwise, for all object indices i and j.


[9] M. Fischler and R. Elschlager, "The Representation and Matching of
Pictorial Structures," IEEE Transactions on Computers, Vol. C-22,
no. 1, January 1973, pp. 67-92.

An optimum matching procedure should minimize the total cost of matching
all objects in a frame. This may be done by computing all possible static
and dynamic costs and selecting the particular set of object matches that
has the lowest total cost. However, the storage and the computational
requirements are too high for this procedure. If we know the maximum
distance a target may have moved between two frames, then we can
restrict our search for a match to a neighborhood of corresponding
size. In this regard the frame alignment helps save search time
by cutting down the neighborhood size. Even then, the storage
and computational requirements for finding the optimum matches
for all objects in the frame may be very high. If there are N objects
to be matched in a frame and each object has at least K objects in its
neighborhood for searching, then there are at least K^N combinations
to consider, each combination having N static costs and N(N-1)/2 dynamic
costs to compute. The Linear Embedding Algorithm of Fischler and
Elschlager [9] is aimed at cutting down such a computational requirement
by trading it off against the global optimality of matching. In particular,
the method may fail to find the globally optimum match if the objects
with low indices in F_0 incur a high static cost when matched with their
optimal object matches in F_1. We have adopted a matching procedure
with similar suboptimality but one that is computationally better suited
to our application. The procedure is independent of object indices but
depends on the relative magnitudes of the static costs.
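
To make the cost of the exhaustive search concrete, the following Python sketch enumerates every combination of labels and keeps the cheapest one in which no label is used twice. It is an illustration only: the cost values are invented, the function name is hypothetical, and the sketch assumes that every object must receive a label (the "no match" case is not modeled).

```python
from itertools import product

def exhaustive_match(costs):
    """Brute-force search for the minimum total static cost.

    costs: dict object -> dict label -> static cost, listing only the labels
    inside that object's search neighborhood.
    Returns (best assignment, its total cost, number of combinations examined).
    """
    objs = sorted(costs)
    best, best_total, examined = None, float("inf"), 0
    for labels in product(*(sorted(costs[o]) for o in objs)):
        examined += 1
        if len(set(labels)) < len(labels):
            continue  # two objects share a label: infinite dynamic cost
        total = sum(costs[o][l] for o, l in zip(objs, labels))
        if total < best_total:
            best, best_total = dict(zip(objs, labels)), total
    return best, best_total, examined

# Hypothetical static costs for three objects with 2, 3, and 2 candidate labels.
costs = {0: {0: 4, 1: 9}, 1: {0: 6, 1: 13, 2: 8}, 2: {1: 5, 2: 7}}
best, best_total, examined = exhaustive_match(costs)
print(best, best_total, examined)
```

The number of combinations examined is the product of the neighborhood sizes (here 2 x 3 x 2 = 12), which grows exponentially with the number of objects and motivates the suboptimal procedure adopted here.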

Let the ith object in F_0 have K_i objects in F_1 in its neighborhood of
search. We shall call these K_i object indices in F_1 the possible
"labels" of the ith object. In our search procedure the static costs for
all possible labels are computed for each of the N objects in F_0. In total
there will be K_T different static costs, where

    K_T = \sum_{i=1}^{N} K_i.

Each of these costs corresponds to an object-label match. We
arrange these K_T costs in increasing order. In case of a tie the costs
are arranged in increasing order of the object index i. We shall accept
at most N of these static costs and corresponding object-label pairs.

In our procedure, the lowest of the K_T costs is always accepted. We
then proceed to the next higher cost. If the label corresponding to
this cost has already been taken by a previously accepted object-label
pair, then we discard this object-label pair (infinite dynamic cost). If,
instead, the object corresponding to this cost has already been taken,
then this object-label pair has a higher static cost than the pair already
accepted for that object. Hence we discard this object-label pair and
proceed to the next higher cost. If a cost is not discarded by either of
the above two tests, then the corresponding object-label pair is accepted
as the next matching object-label pair. This procedure continues until
all the static costs are exhausted.

This algorithm will not give the globally optimum match if the static
cost corresponding to an optimum object-label pair is higher than that
of another object-label pair having the same label. Consider, for example,
two objects, A and B, being matched with two labels, a and b, with the
following static costs:

         a     b
    A    3     7
    B    5    11

The object-label pairs arranged in increasing order of static cost are:
Aa, Ba, Ab, and Bb. The pairs that will get accepted are Aa and Bb,
even though the optimum pairs are Ab and Ba. It is possible that
several iterations of a similar procedure in some suitable manner,
e.g., by relaxation labelling [10], will asymptotically yield the global
optimum match.
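
The ordered static cost procedure and the two-object example above can be reproduced with a short Python sketch. This is a plain reading of the procedure as described in the text, not the accelerated version; the function name and data layout are hypothetical.

```python
def ordered_static_cost_match(costs):
    """Greedy matching in increasing order of static cost.

    costs: dict (object, label) -> static cost, listing only the labels
    inside each object's search neighborhood.
    Returns dict object -> label; unmatched objects are simply absent.
    """
    # Sort by cost; ties are broken by object index, as in the text.
    ordered = sorted(costs.items(), key=lambda kv: (kv[1], kv[0][0]))
    matches, used_labels = {}, set()
    for (obj, label), _cost in ordered:
        if label in used_labels:
            continue  # label already taken: infinite dynamic cost
        if obj in matches:
            continue  # object already matched at a lower static cost
        matches[obj] = label
        used_labels.add(label)
    return matches

# Static costs from the example: objects A, B and labels a, b.
costs = {("A", "a"): 3, ("A", "b"): 7, ("B", "a"): 5, ("B", "b"): 11}
matches = ordered_static_cost_match(costs)
greedy_total = sum(costs[(o, l)] for o, l in matches.items())
optimum_total = costs[("A", "b")] + costs[("B", "a")]
print(matches, greedy_total, optimum_total)
```

The sketch accepts Aa and Bb for a total cost of 14, whereas the pairs Ab and Ba would cost 12, which illustrates the suboptimality noted above. The procedure needs one sort of the static costs and a single pass over them.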

An accelerated version of this ordered static cost method has been
implemented on the XDS9300 computer. The previously mentioned
sequence of Krebs frames was aligned in pairs using the
translation histogram method. The result was processed by the above
matching algorithm using location and size as matching features.

Table  6 shows  the  result  with  the  location  feature,  and  Table  7 shows 
the  result  with  the  size  feature.  A label  of  "0"  implies  no  match,  and 
the  number  following  the  # sign  is  the  frame  number  in  the  tables. 
These  interim  results  appear  to  be  excellent.  We  are  currently  also 
studying  the  use  of  other  features. 

Further study needs to be done in determining appropriate features
for matching objects. The effectiveness of the matching procedure
also needs to be examined and evaluated in detail with data typical for our
application. We are also studying methods of speeding up the translation
histogram method.

[10] A. Rosenfeld, R. Hummel, and S. W. Zucker, "Scene Labelling by
Relaxation Operations," IEEE Transactions on Systems, Man and
Cybernetics, Vol. SMC-6, 1976, pp. 420-433.

TABLE 6. SYMBOLIC MATCHING BY LOCATION

SECTION V


PLANS  FOR  NEXT  REPORTING  PERIOD 


During  the  next  quarter  of  the  program  we  will  complete  the  five  month 
Phase  I design  study.  The  tasks  which  will  be  completed  during  that 
period include: Task 1.1, MODFLIR/PATS interface analysis; Task 1.2,
algorithm  selection;  Task  1.3,  system  modeling;  and  Task  1.4,  design 
review. 

The final algorithm selection and performance evaluation will be made
using  the  available  thermal  imagery  data  set  described  in  Section  III. 

It  should  be  noted  that  the  available  imagery  essentially  contains  only 
two (tanks and APCs) of the five target classes for which the automatic
target  screening  function  is  required.  When  imagery  for  the  other  three 
classes  becomes  available,  the  target  screener  can  be  trained  for  the 
additional  classes.  However,  the  final  selection  of  algorithms  will 
essentially  have  been  made  on  the  presently  available  data. 

We  also  plan  to  complete  during  the  next  quarter  a breadboard  model 
of  the  global  gain  and  brightness  control  circuit  and  its  interface  to  the 
MODFLIR,  and  a breadboard  model  of  the  DC  restoration  circuit. 

With  these  breadboards,  and  that  of  the  already  completed  local  area 
gain/brightness control, the image enhancement subsystem will be
completed  except  for  repackaging  into  the  final  configuration. 